PROCEEDINGS


9:20 a.m. 


DONALD B. RUBIN, PH.D., 
having been first duly sworn on oath,

was examined and testified as follows:

EXAMINATION

BY MR. WITHEY:




Q. Could you state your name and give your
business address for the record, please.

A. My name is Donald B. Rubin. My primary
business address is at Harvard University,
department of statistics, One Oxford Street,
Cambridge, Mass. Although I am not here
representing Harvard in any way.

Q. Professor, my name is Mike Withey and
I have some questions for you in this case this
morning. And if you don't understand any question 
or feel it's too vague to answer, will you ask me 
to repeat it or rephrase it? 

A. Sure. 

Q. I assume if you answer the question, you
believe it is clear enough to be able to answer it
then. Fair enough?

A. I didn't hear that. I'm sorry. 

Q. I assume that if you answer the question,
that it is a clear-enough question that you can
answer it. Fair enough?

A. Yes.

Q. You understand that this is a deposition,
that you are under oath, and that these questions
and answers may be used at the time of trial in
this case?

A. Yes, I do.

Q. And for that reason it is important to be
as truthful and accurate as you can?

A. Yes, I understand.

Q. You have been deposed by myself in the
Northwest Laborers v. Philip Morris case on
September 28, 1998. Do you recall that?

A. I don't recall the exact date but
I recall the telephone deposition.

Q. And have you had an opportunity to review 
that deposition? 

A. A while ago, yes, I reviewed it and made 
corrections on it and returned the corrections. 

Q. Other than those corrections, if asked
the same questions today, would you give generally
the same answers as you gave then? We could save
some time in this deposition so I don't have to go
replow the areas that I covered back then.


A. I believe that's correct, although it is
possible that my thinking on some topics has been
refined.

Q. As you sit here today can you think of
any topics your thinking has been refined on, sir?

A. Well, just in a general sort of sense,
the right way to go after the model-building.
I've done some more thinking about it, and
I don't think my answers would be in any way
contradictory to what I said before but they might
be slightly more developed.

Q. In what areas?

A. In the right way to address the question.

Q. Which question?

A. The question of medical expenditures due
to the alleged misconduct.

Q. Have you put those additional thoughts 
that you have into any report in this case? 

A. Some of those thoughts are reflected,
I believe, in both the initial report and the
supplemental report in this case, I think both of
which were written after that deposition. So to
the extent that those reports are different from
the reports for Northwest Laborers, I think they
are slightly more refined; and there are some
analyses in there that weren't in the Northwest
Laborers Report. And I may have done -- And
I have done some more thinking about the issues in
the right way to do that model-building even after
I wrote those reports for Northwest Laborers.


Q. I take it, then, any additional thoughts
about the model-building, as you
described it, after giving the deposition in
September, to the extent to which you had developed
them you incorporated them within your November 6
and supplemental report in this case. Correct?

A. To the extent of my thinking at that
time, yes. But I have continued to think about it
afterwards. There are perhaps some thoughts that 
I have had since then that wouldn't be reflected. 

Q. Since the November 6 report in this case? 

A. Yes. 

Q. And since the supplemental report in this 
case that we'll mark soon but which was submitted
sometime in -- when was it? -- December?

MR. BIERSTEKER: I think it was
November 20, to my understanding.

A. November 20, I think that's right, yes. 


Q. All right. And can you tell us what
those thoughts are?

A. Let's see. Well, I'm preparing an
extended version of the document that was attached
to the first report which provides more detail; and
in preparing that document I did some more refined
thinking about the issues that were described in
that attachment to that initial report. That
attachment was a paper that I was invited to
present at the annual meetings of the American
Statistical Association and was called What Does It
Mean To Estimate The Causal Effects Of Smoking or a
title something like that. I'm not sure exactly that


was the title. And I've continued to work on that 
paper, and some of the work was done after November 
20, and so some of the thoughts in working on that 
paper weren't necessarily reflected in the November 
20 supplemental statement. 

Q. And I'm asking you what they were. 

A. Let's see if I can be more specific.
I think that paper ended with a Section 6 which was
talking about this assumption, this critical sort
of assumption that relates to relative risks. And
not only is the writing more refined for the first
six sections but there are one or two more sections
that are being added, one of which deals with how
to estimate distributions in the real world, the
actual world, some comments on that. And then I'll
have another section on how to, quote, estimate,
unquote, or specify distributions that are needed
in the counterfactual world dealing with what
would have happened in the absence of the alleged
misconduct.

Q. What distributions are you referring to?

A. Well, the actual-world distributions that
underlie the calculation of relative risks of
smoking behaviors given background characteristics
of people; the distribution of prevalence of
smoking behaviors and the prevalence of other
health-related behaviors. Another distribution
that needs to be estimated is the distribution of
actual expenditures, the pots of dollars in time as
a function of background characteristics and
health-related behaviors.

Q. And who is working on that with you?

A. On that paper? 

Q. Yes. 

A. I'm working on it alone.

Q. Do you have any research assistants or
graduate students who are running any numbers for
you?

A. No. So far this paper has no numbers in
it; it's a thought piece on the right way to
formulate the problem and the right thing to do.
It's possible that some version of it will
have some numbers: for example, some of the
numbers that were in the supplemental report here
or maybe in some of my reports in other states.
For example, there are tables in my supplemental
report that show propensity score analyses
indicating, for instance, how far apart smokers and
non-smokers appear to be in certain background
characteristics. Those may appear in the paper at
some time, and those tables were not generated by 
me; they were generated under my instruction but 
they were generated by a former student of mine. 

Q. Who was that? 

A. T. E. Raghunathan, who is at the
University of Michigan.

Q. Anybody else do any work on the topic
you've just been addressing?

A. With respect to this paper, no. In fact,
Raghunathan's work may not even -- Raghunathan's
work that he generated on my instructions may not
appear in this paper. I'm not sure. Haven't
ire yet. 

Q. Is anybody paying you for that time?

A. Well, for the time that I spend that
is focused on this case, I will bill the
law firm for that time that is focused on the case.
The other time I regard as a sort of, quote,
scholarly contribution and I'm not billing anybody
for that.

Q. You said you were invited to give a talk
paper at the American Statistical
Association?

A. Right.

Q. Did you give the talk?

A. Yes.

Q. Did you write a paper for the talk?

A. That's the paper that is attached, yes.

Q. And I take it, then, was any of the time
spent in writing that paper billed to the tobacco
industry law firm?

A. Certainly the part of the time that was
spent in developing the ideas in going forward with
it was billed to them, yes, but the part of the
time that I thought was more of a scholarly
contribution -- for example, the time presenting
the paper or discussing the paper after the
presentation with people -- was not.

Q. This presentation was made in August of

A. Yes.

Q. Did you tell them that the time in
developing the ideas was paid for by the law firm
for the tobacco industry?

A. Did I announce at the beginning that it
was?

Q. Yes.


A. I did not announce at the beginning. But
it was at a session that was on statistics and law,
and the person who was chair I believe or the
person who invited the session was a guy named Joe
Gastwirth, who's at George Washington University,
I think, in Washington, and he made a statement
either at the beginning or in the middle or at the
end that I was involved in litigation on the side
of the tobacco industry. So there was no --
Nothing was being hidden at all. And in fact my
memory is that Joe made a comment that Scott Zeger
and I were on opposite sides of the issue in
Minnesota. I have a fairly clear memory of that.


I mi gl 


re 11 


m ■ 


s wrong, but I think that was made. 

Q. Was Zeger there?

A. Was Scott in the audience? It was a
r audience. I don't think I saw him 

did see him at the meetings. He was at 


the ; I remember bumping into him crossing 


the si 


very 


so at 


But I don't believe he was there. 

These are annual meetings and they’re 
with lots of parallel sessions running, 
same time as mine he may be speaking at 


the same time. 

Q. I just asked you if Scott was there. If 
you know, fine; if you don't know, fine. I don’t 
care how big the meetings were. 

A. Okay. 

Q. How much time did you bill the tobacco
industry in developing the concepts that went into
this paper?

A. I don’t have a clear recollection of that 
right now. I could try to look at records I have 
to figure that out. 




Q. Give me your best estimate.

A. Just kind of wild guess, but maybe
50 hours, maybe 60 hours, maybe 40. I don't know.


Maybe 


# 


m 



s, maybe less. 

At $500 an hour? 


What was it an hour you billed them for? 
At 12 . 

$1200 an hour? 


Yes . 






(Rubin Deposition Exhibits 1 and 2
marked for identification.)

BY MR. WITHEY:


Q. Handing you what's marked as Deposition 
Exhibit 1, Professor Rubin, is this a report and 
attachments — they did put your C.V. in there, 
okay -- including your C.V. and the paper presented 
at Dallas that we have been referring to submitted 
on November 6, 1998? 

A. Let me take a look. (Pause) It appears
to be. Let me make sure the document is at the
end. The document is at the end, okay. Yeah, this 
looks like it. 

MR. BIERSTEKER: I would just note
for the record, Mike, I haven't looked through the
document, but there appear to be
markings that were not on the original.

MR. WITHEY: Sure.

MR. BIERSTEKER: But that's fine.
Go ahead.





MR. WITHEY: I may have underlined a 



A. And I don't think I've seen the cover
letter before. That's the first page, but

Q. Is Exhibit 2 to this deposition your
supplemental report for the Iron Workers Local 17
litigation which attaches a report you submitted in
the tobacco litigation in the state of Oklahoma and 
a report you submitted on behalf of the defendant 
tobacco industry in the state of Minnesota related 
to the Zeger report? 

A. I didn't remember that the supplemental 
report actually had those others as attachments. 
Were they attachments or were they just included?

Q. Well, I don't know what the difference
between attachments and included means. We
received them accompanying your supplemental report
on November 20, I think it was. I'm not sure of
the dates on that. At any rate, is that your
report with the attachments or inclusions, your
supplemental report?

A. Okay, this looks like my report of
November 20. And now just let me look to see
has.... Okay, that looks like the state 
a. Appears to be Oklahoma. And what 
be also attached to it is the 
al report in Minnesota, not the original 


Q. Thank you.

Now directing your attention to
Exhibit 1 and the paper entitled What Does It Mean
To Estimate The Causal Effects Of Smoking in
Exhibit 1, do you see that?

A. Yes, I do. 

Q. That's the paper that you drafted for and 
presented at the Dallas meeting of the American 
Statistical Society? 

A. Yes, that's correct.

Q. And was this paper distributed then at
that meeting? 

A. No. I don't believe it was really fully
written yet so I don't believe it was distributed.
I mean, it was not distributed. Now I remember.
It wasn't distributed.

Q. Has it been distributed since then?

A. It's been sent in to the American
Statistical Association for inclusion in its
proceedings and it has been distributed to anybody
who asked for it.

Q. So I take it the American Statistical
Association will publish the papers that were
submitted in the report of proceedings that are
held at their national conference?

A. Yes. They typically publish the papers
that are invited by particular sections, so the
publication is by a section of the American
Statistical Association. So I think this was
invited by the section on statistics and 
epidemiology. 

Q. And that would be so that if someone
happened not to have attended the session because
there's a lot of other sessions to attend but
wanted to find out who wrote what or who talked
about what, they could find that out by getting 
this report of proceedings? 

A. Correct. 

Q. It is not a peer-review process, however,
is it?

A. No, it's not.

Q. Have you submitted this paper for peer
review?

A. Not yet.

Q. Are you intending to do so?

A. Yes.

Q. To whom do you expect to submit it?

A. Well, I probably eventually will submit
it to Statistics In Medicine, which is a peer-
reviewed journal, because there's another place I'm
supposed to be presenting this work and they want
the papers to be submitted to Statistics In 
Medicine for peer review. 

Q. Where are you going to present? 

A. CDC, Centers for Disease Control. 

Q. When? 

A. Friday, day after tomorrow. 

Q. Down in Atlanta?

A. Yes. I was invited to make a
presentation.


Q. Have you sent them a copy of this paper? 

A. Pardon? 

Q. Have you sent them a copy of this paper? 


A. I don't remember. It's possible, because
if somebody there requested a copy I certainly did.
There are a variety of people that have requested
copies and I've sent them. One person requested
many copies be made.

Q. Who was that?

A. I believe it was Joe Newhouse.

Q. Who is Joe Newhouse?

A. He's at Harvard University, an economist
who's at Harvard University in the medical school,
the School of Public Health, Kennedy School.

Q. Do you know why he asked for numerous
copies?


A. I think he was going to use them for a 
seminar or class he was teaching. I'm not sure, 
though. That’s a guess. But also Joe was involved 
in the Massachusetts litigation. 

Q. On behalf of whom? 

A. Massachusetts.

Q. Where is the location of the CDC
conference you're going to present this paper at?

A. Atlanta, wherever their offices are.
I remember the hotel. It's a Sheraton hotel where
I'm supposed to stay but I think just changed its
name to something else now. I don't know what it
is.

Q. Who at the CDC invited you?

A. God, there's a variety of names that come
up. I think one name is John Odenkrantz. There
are a couple other names in the e-mails that I've
seen because there are people who are helping
organize it. But I think John Odenkrantz.

Q. Who is Odenkrantz with? The CDC?

A. Yes.

Q. Do you know what section of the CDC or
what division?

A. No, but I can certainly find out for you


later. In fact, I could probably find out for you 
this afternoon by calling my office and asking them 
to look at the e-mails. I think I could probably 
get the other names of people who have been 
involved as well. 

Q. That would be great. If when we take a
lunch break you could provide that to us, that
would be great.

A. Sure.

Q. But to your knowledge was this paper then
included in the report of proceedings from the
Dallas meeting of the American Statistical
Association?

A. I don't think that has appeared yet.

Q. But you're expecting it to?

A. Yes. I believe it will.

Q. And in the paper does it mention that
the work done in conceptualizing the paper,
the ideas in the paper, was paid for by the law
firm for the tobacco industry?

A. I don't think it does.

Q. Do you think it should have?

A. I'm almost sure it does not. Do I think
it should have?

Q. Yes.

A. I don't believe so, I guess, because it 
is really just a scientific document that makes no 
statement one way or another about who's sort of, 
quote, right or wrong or what dollar amounts are 
involved. It is a statement of how to do the --
how to ask the question statistically. I would
write the same thing no matter who supported the 
time to write it. 

Q. Other than some additional I guess
refinement of the analysis as presented in this
paper as given in Dallas that you've described,
we have been informed by Mr. Biersteker that you
have undertaken another effort to try to make a
further analysis of the NMES, the national medical
expenditure survey, data. Is that correct?

A. To clarify, I've taken on an effort to
correct for the missing data in NMES so that
further analyses can take place that have some
statistical validity.

Q. When were you first asked to do this?

A. Well, let's see. We've talked about
doing it on and off for a while. "A while" means


several months. And I gave some advice in terms of 
names of people who could do it, this multiple 
imputation for the missing data, addressing the 
missing-data problem in NMES, to Peter Biersteker. 
And I took on the task myself probably about three 
or four weeks ago and tried to organize the project 
starting then. The three or four weeks may
actually be two weeks, three weeks, two and a half.
Something like that. Maybe three. My guess is
three. Although, as I said, we talked about the
desirability of doing so a while ago and I made
some suggestions for somebody who could actually do
it probably four months ago, three months ago.

Q. Who did you suggest?

A. I suggested another former student of
mine, Joe Schafer, with whom I've done a couple of
major multiple imputation for missing data projects
for the federal government; for example, for the
National Center for Health Statistics for NHANES.
And the organization, just to be clear, was NCHS,
National Center for Health Statistics.

Q. How do you spell Schafer?

A. S-c-h-a-f-e-r.

Q. And where is he?

A. He's at Penn State University, department
of statistics.

Q. Has Dr. Schafer, Professor Schafer, 
whatever, done any work on this? 

A. Professor Schafer has done some initial 
work on it. 

Q. When was he first asked to do that work?

A. I don't know exactly because I think
Peter Biersteker made the request. But my guess is 
it's about three months ago. 

Q. Did Mr. Biersteker give you any time line
for when the work had to be completed by for the
purposes of the Ohio litigation?

MR. BIERSTEKER: Object to the form.

A. Well, I think it is always clear that it
should have been done ideally prior to this
deposition, probably a week prior to this
deposition. I don't remember exactly. But I think
the intent was there. There was a short
amount of time available to do it.

Q. You have been deposed before. Is that
right, Professor?

A. Yes, I have.

Q. And you understand that at least part of
the purpose of a deposition is so we can know what 
your testimony is in the upcoming trial, which is 
going to start in less than a month? 

A. Yes, I understand that. 

Q. And you understand that that allows us to 
prepare for the trial and allows us to prepare, if 
you are called, to cross-examine you at trial?

A. Yes, I understand that.

Q. And you understand that if you come up 
with a new report or new information or new 
supplemental analysis using an MI or multiple 
imputation analysis of data that we haven’t seen 
before, that we have a right to ask the judge not
to let you testify about it. Has that been
explained to you?

A. Well, in a general way. I'm certainly
not a lawyer, but I generally understand the idea
that introducing things at the very end doesn't
of an opportunity for you to ask 

So I understand that in a layman's way. 
Q. And you have no problem that that's a
basic fairness idea?

A. I understand the logic for doing it as a
Id, if that's what the question was. 

Q. That’s all I'm asking you. As a layman 
you understand the logic and it has something to do 
with fairness. Right? You would agree with that? 

A. I certainly know when I was looking at 
supplemental reports from Harris, on and on again, 

I thought "Oh, my God, another one of these!" It
kind of puts me at a disadvantage. And
I understand that's what you're saying. It works
both ways. I understand that. 

Q. When did you first realize there was a 
missing-data problem with the national medical 
expenditure survey? 




A. When I first got involved in any of this
litigation, which I guess goes back to the first
I was involved in, which must be I think
Mississippi, or maybe Florida.

Q. And that would have been in 1996, '95-96?

A. Wait. How long have I been involved in
may have been '97.

Q. All right. A couple of years then? A year
and a half, two years?

A. Yeah, something like that; a year, year

Q. And in fact in your report that you


submitted at least in Oklahoma you drew attention 
to the missing-data problem in NMES. Correct? 

A. Absolutely. 

Q. And the same with Minnesota, you in fact 
testified about it? 

A. Absolutely. 

Q. And you did that in Washington, the
Northwest Laborers case? I asked you some
questions on page 74 of that deposition about 
missing-data issues. Correct? If you recall. 

A. I don't have an explicit recollection of
that, but I do not doubt that you asked me
questions about it and that they're on 74, if you
so represent.

Q. Starting at 74, anyway. Okay.

Knowing that there was a missing-data
problem -- Rather than being unclear, Professor,
tell us what you mean by the missing-data issue as it
relates to NMES. Tell us what that means.

A. As in almost every survey, all the data
that the survey takers tried to collect were not
collected. One of the major problems in NMES for
this general case, talking about all the tobacco
litigation cases, is that expenditure data is often
missing. There are other kinds of missing data as 
well on some background variables like whether 
people are married, whether they use seat belts. 
Smoking behavior is sometimes missing. 

So there are a variety of important 
variables or factors that are often missing in 
NMES, and in the various states there were a
variety of ad hoc and inappropriate methods being
used in the plaintiffs' analyses to try to address
the missing-data problems.

Q. Your critique of the use of NMES went
beyond just the missing-data problem; there were
some other problems you've identified in various
reports. Fair enough?

A. To be clear, I think when I identified
problems they were mostly identifying problems with
the way the plaintiffs' experts had done analyses
with NMES, not accounting for the complications in
the database. I don't remember ever having
just criticized the NMES dataset as being hopeless
in any sense. But it had complications. It had
complications with missing data, large amounts of
missing data. It had complications in the sense
that it's not a simple random sample; it had a
weighting structure to it. So that correct
analyses, valid analyses, had to take account of
these complications in NMES.

Q. I've asked you this question in the 
Washington Northwest Laborers case and you gave an 
answer about NMES and I didn't recall that you made 
any corrections to that answer. You would stand by
what you said about NMES back then. Fair enough?

MR. BIERSTEKER: If you want to show
it to him and ask him, fine.

A. I would be happy to look at what I said
about NMES. I don't think I would have anything to
ith. But to be absolutely sure.... You 
know, sometimes sentences can be taken out of
context and read different ways. To be absolutely
sure, perhaps I should take a look at it; and
I would be pleased to do so.

Q. Let's go back to the multiple imputation
issue. I take it that in identifying the missing-
data problem with NMES and in submitting reports,
you had some discussions with lawyers for the
tobacco industry about your opinions in that
regard. Fair enough?

A. Yes.

Q. Was that always Mr. Biersteker or were
there other lawyers that were involved?

A. There were other lawyers involved 
sometimes. 

Q. Can you name the lawyers that you have 
met with and discussed NMES generally with? 

A. My memory is at the beginning Tom Selfin
was involved.

Q. And if you happen to know the law firms
or who they represented, please add that to your
answer.

A. Okay. I probably don't.

More recently Barbara Harding. In 
lark Hall. What I'm trying to do is first 
names of lawyers with whom I spoke; and 
talked exactly about NMES or not I'll 
link harder about. Certainly talked about 
lissing-data problems with Tom Selfin and 
iteker and Barbara Harding. I believe 
Morrie Leiter I believe talked about 
ilems; he was in Washington, I believe, 
lother name I'm blocking on in Oklahoma 
and T|||gg$§§£|, Tom —— Is Peter allowed to help me on 
this? Oh, boy! I can see his face, I can hear his
voice, but I can't think of his last name. 

Q. All right. 

A. Oh, there's another fellow from the early 

days I haven’t spoken to probably for a year and a 
quarter who was before the first deposition that 
Peter would also have to help me with. He was in 
Minnesota as well. But I don't remember explicitly
whether some of these other folks talked about NMES
missing data but presumably we did. 

Q. At any time in any of the discussions 
with any of those defense lawyers you just named or 
any other tobacco industry lawyers that you've 
discussed NMES with, did you talk to them about
"Well, if we do a multiple imputation analysis of
this missing data, then we might be able to produce
statistically valid results from NMES"?

A. I'm sure there were conversations to that
effect.

Q. And I assume, though, that at the time
of submitting reports in Oklahoma, in
Mississippi, wherever it was, that you
understood -- I think you've already testified --
that the plaintiffs' experts in those cases
including Professor -- what's his name? --
Harrison, Zeger, others were utilizing NMES to
support the claim for Medicaid expenditures in 
those cases? 

A. Correct. I think I did. 

Q. And you understood that if you developed 
a way of multiple imputation analysis that could 
then be used to validly estimate differences in
medical expenditures between smokers and non-
smokers, that that multiple imputation analysis
might be used by statisticians on behalf of the
plaintiffs. Fair enough?

A. Yes. And just to be clear, the multiple
imputation addresses just the missing-data problem;
address other problems in their 
So just with that clarification, yes. 
Q. Right. And is it fair to assume that if
you had been given more time -- Let me ask you
this in other words. If you had been asked in 1997
the following question: Professor Rubin, could you
do for us a multiple imputation analysis of
the missing-data problem with NMES in order to see
if it can be validly and reliably utilized in a
statistically appropriate way, you could have done
that. Fair enough?

A. I believe so. Although just one point of 
clarification is that over this year and a half, as 
with lots of things in computing, things you rely 
on in computing, there have been improvements in 
not only hardware speed and storage and locations 
but there has also been a better understanding of 
some of the algorithms that underlie this. It's a
developing area. So it's not as if had we started
a year and a half ago we'd be exactly where we are
now, starting now; in the same sense as the
computer you have now that, who knows what you paid
for it, you know, $1500, $2,000, that three years
ago or two years ago would have cost twice that.
So it's not quite like it's all a waste of time.
Things have improved since then.

Q. Did you ask any of the defense lawyers
you've been working with for the tobacco industry
whether they wanted you to do this multiple
imputation analysis back when you first identified
the missing-data problem with the NMES data that
the plaintiffs' experts were using?

A. I don't remember if it came up as my
asking them whether they wanted to. There was
certainly discussion about that.

Q. When were you told "Yes, we want you to
do that"?

A. Well, I wasn't told yes, they want me to 
do it until a few weeks ago. 

Q. Until a few weeks ago? 

A. But the discussion that yes, we should do 
it, who should we have do it, that was maybe four
months ago, five months ago.

Q. Four or five months ago what was said 
about doing the multiple imputation analysis of the 
NMES missing data? 


A. I think it was to the effect that we
really should try to do this right, these analyses
correctly. A component of that is dealing with the
missing-data problem in NMES; we should try to take
care of it.

Q. And who articulated that?

A. I believe Peter did, Peter Biersteker.

Q. And what was your response?


A. My response was "Great." I've done it
before in other -- I've done it in the sense of
organizing people to do it before in other
government surveys and I think we can capitalize on
Joe Schafer's experience in having done that in a
couple of government surveys.

Q. So what happened between four and five 
months ago versus two or three weeks ago in terms 
of this project then? Nobody said "Yes, you're 
authorized to do it, go ahead and do it," or what? 

A. My understanding is that Peter got Joe
Schafer involved and Joe kind of ran out of time to
do it.


Q. When did he get Joe Schafer involved? 

A. I believe about three or four months ago; 

maybe three months ago. 

To be clear on the size of the
project, when Joe Schafer and I and a sort of team
of other people which included colleagues of mine
and people at NCHS went on the project to multiply
impute NHANES, that was a very long project and
done with the support of a variety of people at
NCHS themselves, and that was probably a project
that took four years, maybe three years. So we
were trying to piggyback on that experience to do
it much more quickly.

Q. At the time three or four months ago or
five months ago, whenever it was that you had this
discussion with Mr. Biersteker about doing the
multiple imputation on the NMES data, you had been
identified or at least you had been talked to as an
expert in this Ohio litigation.

A. Yes. I invented it so I guess I'm an 
expert in it, so.... 

Q. Right. But the point is you knew there
might be some deadlines in terms of reports in this
ironworkers case. Correct?

A. Yes, I was aware that there would be 
deadlines. 

Q. Did you ever discuss with Mr. Biersteker
what those deadlines might be prior to two or three
weeks ago?

A. As I think I mentioned a few minutes ago,
yes, there were discussions about deadlines and
trying to have work completed before depositions.

Q. But you submitted a report, Exhibit 1, on
November 6. You knew there was a November 6
deadline for your report. Correct?

A. Yes, at some point I knew it was November
6. I don't remember how many days before I knew it
was November 6.


Q. Then you also submitted this supplemental
report and you knew there was a deadline for this
supplemental report, whatever it was.

A. Yes. 

Q. What was it again? November 20? 

A. Yes. At some point, maybe a week before, 

maybe two weeks before, I knew that there was a 
deadline like that. I've also had experience with
deadlines shifting.

Q. Did either Mr. Biersteker or Ms. Harding
or any of the defense lawyers in this case say "It 
doesn't matter when you submit your report, get it 
to us whenever you want"? 

A. Absolutely not. 

Q. Because you were an expert in the
Washington litigation, you were aware at least as
of some time after June 5, 1998, which is when the
plaintiffs' damage model was presented to the court
and to the defendants in the Washington Northwest
Laborers case, you were aware that the damage
models done by the plaintiffs' expert in the
trust fund cases were not relying upon NMES.
Correct?

MR. BIERSTEKER: Object to the form
of the question. I think it mischaracterizes the
state of the record.


A. I think I understand it, but could you 
rephrase it just to make sure? 

Q. Of course. You've read some of the 
damage model expert reports by Dr. John Dement, 

Stan Roberts, Weintraub, Dr. Flanders — 

A. Lauer. 

Q. Lauer eventually, yes, although he was in
Ohio; now I'm talking about the experts that were
identified in Washington.

A. Okay.

Q. You read their reports?

A. Yes.

Q. And you've testified about some of your
comments on those reports. Correct?

A. Correct.


Q. And you were provided those reports by
your counsel in order for you to make an
[illegible] of the reports from a statistical
[illegible]. Correct?

A. Correct.

Q. And you thus knew at the time you read
those reports that the reports did not refer to
NMES?

A. Correct.


Q. And that they were relying upon a
somewhat different model for estimating a
disease-specific cost for trust funds that is
attributable to smoking, at least in those reports?

A. Correct; although cost is sort of odd 
because they didn't do analyses with dollars, 
really, in their models.

Q. But you understand the basic point then
that they were not using the national medical 
expenditure survey in deriving their damage 
estimates. Correct? 

A. Correct. Well, let me.... Dr. Harris
was referring to some studies that did rely on
NMES. It's not entirely correct, I don't
believe.

Q. Was Dr. Harris identified in the
Washington case?

A. No, I was talking about this one. I'm --
I apologize. You're only talking about Washington?

Q. All right.

A. And also I got slightly confused thinking
Washington State. I'm sorry. I apologize.

Q. Is it fair to assume that if you or the


other defense experts had wanted to use NMES data 
in a valid way to offer opinions regarding the 
differences or non-differences between smokers and 
non-smokers in terms of medical expenditures, you 
could have done that multiple imputation analysis, 
if asked early enough, in order to be able to 
submit a report by November of 1998?

A. Could you rephrase that? Or maybe
I can --

Q. Sure. What I'm asking is: If you had
been given the deadline of November 6, 1998, and
were asked early enough we'd like you and perhaps
Professor Schafer to develop a multiple imputation
analysis of the missing-data problem in NMES and
present that in an expert report, is it fair to
assume you could have got it done by November 6 had
you been asked earlier?

A. If I had been asked to be involved
[illegible] and had realized that Dr. Schafer's
schedule would have prohibited him from doing it in
a timely way, I would have put other people
[illegible]. Which, I have been going to try to do it
instead.

Q. And you could have done it by November 6?

A. Yes, if I was sort of put in charge at
that time and actually had the foresight to realize
the complications of certain people's schedules.

Q. Were you the only defense expert who 
offered criticisms of the NMES study or the NMES 
data in any of the state AG cases? 

A. You mean of the use of the NMES data?

Q. Not only the use of it but how NMES was
conducted and run and how the data was derived and 
how the surveys were responded to; you know, those 
kinds of things. 

A. I believe there were other people who
were critical certainly of the analyses. I'm
trying to distinguish between the analyses and the
database itself as potential for supporting
analyses.

Q. Let's use both just for now. We'll
probably divide it up later.

A. Okay.

Q. And who are those people?

A. Who were critical of the analyses?

Q. Either. Either critical of the way --
Who or testified on behalf of the
defense about NMES generally.

A. As long as my answer doesn't imply that
they were critical of the NMES database as a
database. I don't think these people were.

Q. All right. Just tell me which people — 
A. Wecker; Brian McCall. They both
testified. What other reports have I read that
were critical? I think those are the two primary
other ones whose reports I've read. There are
other people, other defense experts whom I've met
with once.

Q. Who are those? 

A. I'm trying to remember their names.
There's a woman whose name I'm blocking on who
worked for the agency that produced NMES, is now at
the University of Michigan.

Q. And where was she a witness?

A. I don't think.... Well, I think it was
Oklahoma.

Q. And she is at the University of Michigan?

A. Yes, at the ISR, Institute for Survey
Research.

Q. You don't remember her name?

A. Nancy Mathiowetz. I don't know how to
spell it.

Q. Nancy Mathiowetz?

A. Mathiowetz. I think that's the name.
But I could be wrong. I'm not --

Q. What did she have to say about NMES, if 
you recall? 

A. My memory is that some of the points were
the same as mine, which is that there is a
missing-data problem. The survey itself had a complex
structure; it was not a simple random sample. Both
of which are less criticisms of NMES per se than
the way it has to be analyzed carefully. Did she
have any other opinions? I'm not -- I don't
remember. I don't recall.

Q. You met with her then? You know her?

A. No, I've never met her in person.

Q. Just by phone or something?

A. Exactly.

Q. But you've read her report?

A. I don't believe I've ever read a report,
as a matter of fact.

Q. But you have at least had a discussion
with her about what her opinions were, what her
ideas toward NMES?

A. Right. Because I think she worked for
the agency that produced NMES and so knew a lot
about it.

Q. She worked for the agency that — 

A. The agency that produced NMES, the Agency
for Health Care Policy. That's the name of the
agency that produces NMES; and before she was at
University of Michigan she was, I believe, a

[...] as a [...]

questions was, where you could find out the actual
exact form of the question, what the databases 
might look like and that sort of stuff. 

Q. That was pretty useful then? 

A. Less so for me than for people actually
doing analyses of NMES who had to actually
worry about the database structure.


Q. Do you know if she actually submitted a
report?

A. I don't know.

Q. Did you ever ask for a copy of it?

A. I never asked for a copy, no.

Q. You found her to be a useful source of
information then. Correct?

A. She was knowledgeable about NMES in ways
I certainly am not, so she was a useful source of
information.

Q. You don't happen to know how to get ahold
of her, do you?

A. I imagine if you put in a phone call to 
the Institute for Survey Research -- 

Q. Where is that? At Ann Arbor? 

A. Yes, at Ann Arbor. -- and ask for a name
that probably sounds like that, you'd find her.
Tom Fennell. I just remembered the name of the
other lawyer in Oklahoma. Okay? I was thinking
about.... Another way to do it, you could try to
contact the lawyer for the defense in the Oklahoma
case, the Oklahoma AG case, and ask them to get in
touch with her.

Q. You were aware at the time you were
preparing reports in Oklahoma and Minnesota that
the plaintiffs' experts included two people whose
last names are Miller, Leonard Miller and Vince
Miller?

A. Yes.

Q. And you've reviewed their reports?

A. Not for a while, but I reviewed them in
the context of those cases, yes.

Q. At the time?

A. Yes.

Q. And those reports concluded or gave the 
opinion that smokers had greater medical costs than 
non-smokers, in the Medicaid population at least, 
using the NMES data. Correct? 

A. That was their conclusion, yes. 

Q. Do you recall generally by what 
percentage the smokers' medical costs exceeded the
non-smokers' in those experts' opinions?

A. No, I don't, because I guess I regarded
the analyses as being so flawed that I never sort
of stashed away the numbers that came out.


(In recess 10:20 a.m. to 10:22 a.m.)

BY MR. WITHEY:

Q. Have you read any of the defense expert
reports as it relates to damage model or NMES in
this case other than, obviously, your own?

A. As it relates to NMES?

Q. Yes. Either Wecker's, McCall's --

A. Oh, defense. I'm sorry. I misheard you.
I heard plaintiffs.

Q. That's okay.

A. As relates to this case? I don't think
so.

Q. You have not read either William Wecker's
or Brian McCall's report?

A. Not for Ohio I don't believe.

Q. How about for Washington, the Northwest 
Laborers case? 

A. The Northwest case? I think I read or at 
least glanced through parts of it right before the 
phone deposition that you took. And I’d have to 




look at that. If we could look at that deposition 
I could tell you, because I think you asked me 
about that then and I was replying that I had just 
seen a couple documents. But I haven't looked at 
them since then. 

Q. Do you have the understanding that both
Brian McCall and William Wecker are now utilizing
the NMES database in support of their opinions
about damages in this case?

A. I have the general understanding that
they are pursuing the NMES database in their
[illegible] for the defense.

Q. And have you had any discussions with
those individuals about your opinions on the
NMES database, et cetera?

A. Back in Minnesota days I had some brief
discussions with Bill Wecker. I don't believe I've
ever had a discussion with Brian McCall other than,
you know, a cordial hello when we met in Minnesota.
Since then I've probably had a couple, maybe three
phone conversations with Wecker.

Q. Tell me what you said when you talked to 
him around the Minnesota time and what the subject 
matter of the conversations was on the telephone
with Mr. Wecker.

A. Okay. Around Minnesota time, this was at
the time of the trial and they were quite brief,
really. And I'm trying to remember what they were.
I don't have a specific recollection of the content
of the conversations at the Minnesota trial.

Q. How about the telephone calls?

A. The telephone calls since then.
I remember two general ones. I believe this is
correct. There's one that I certainly do remember
which was a discussion of this thought piece, a
preliminary version of this article that turned
into a proceedings article in which I wanted his,
his firm's, comments on whether it was
readable, understandable, and just to make sure
that it was making contact with statisticians who
might want to use that as a sort of template to do
analyses.

Then I believe there was another 
phone conversation I think I had with Wecker, 
although it may have not been with Wecker, it may 
have been -- I think it was -- dealing with 
propensity score analyses that could be done with 
the NMES database in particular, I guess. But
generally propensity score analyses do help
document and adjust for the differences between
smokers and non-smokers and former smokers and
non-smokers in doing these analyses for getting
relative risks.

Q. Relating to the NMES database?

A. Well, I think the general conversation
didn't refer to any specific database because the
general comments referred to any analysis that one
might do. So it may have been in Wecker's mind it
was focused on the NMES database. I don't remember
[illegible] in our conversation.

Q. Whether it was a focus or not, would your
discussion of propensity scoring have included the
way it might be applied to some of the data derived
from NMES?

A. Implicitly, sure. Explicitly, I don't
remember.

Q. That's all I’m asking. 

A. Okay. Certainly I -- Just let me 
clarify. When I'm having that conversation, 
certainly I would have had in mind that whatever 
I was describing and recommending might well be
applied to NMES.

Q. That was really my question.

A. Okay, fine. 

Q. And did you have a discussion with him 
about multiple imputation for the missing-data 
problem with NMES? 

A. I don't remember but I would not be
surprised if we did because that would have been
my recommendation for how to handle the
missing-data problems before going on and doing
these other analyses.

Q. Did you have a discussion with either him
-- You haven't talked to McCall, then, I take
it?

A. No, I have not.

Q. All right. -- with him or to your
knowledge any other defense expert about the idea
that NMES has this missing-data problem, it is a
problem, I have what might be a solution with the
multiple imputation work that I've done? Did you
have that conversation?

A. I mean, I certainly must have talked to
Bill Wecker or people from Wecker Associates about
that and how to do that, and I have had
conversations more recently with them about how to
actually do part of it and how to analyze a
multiply imputed dataset. To be clear -- Yeah,
I just remembered something else. At one point
I did send Wecker a list of references on
propensity scoring methods, a list and maybe even
some copies of articles that he may not have had
access to as easily as I did. And I probably also
sent him a list of references and maybe even some
copies of articles on multiple imputation,
perhaps at the same time, perhaps at another time,
I'm not sure.

Q. Did you understand from anybody, could be
Mr. Biersteker, that Wecker and McCall had
submitted reports that had relied upon NMES data
without having done a multiple imputation analysis
of the missing data?

MR. BIERSTEKER: Objection, asked and 

answered. 

A. Did I know that they had submitted 
reports based on NMES? 

Q. Yes. 

A. Yes. 

Q. Well, not only on NMES but — 

A. Oh, without multiple imputation. Okay.

Q. Strike the question. Did you understand 
they had used the NMES database with its 
limitations of not having -- Strike that. 

Did you understand that Wecker and
McCall were utilizing the NMES database including
the fact that it had missing data in deriving their
opinions in this case?


MR. BIERSTEKER: Object to the form
of the question.

A. I think I understand the question,
but I don't think it was asked properly. So
can I sort of reformulate it?

Q. Sure, go ahead.

A. I understood that Wecker and McCall were
using the NMES database with its, quote, flawed
version of having missing data and going forward.
So they would say, okay, we'll accept the
plaintiffs' analyses, how they handled missing
data, and now we'll show that it still has
problems, but sort of in a style saying these
analyses don't say what they do with missing data
is correct but let's accept it and show other flaws
in the plaintiffs' analyses. I was aware that
those kinds of analyses were being conducted.

Q. Did you caution them, either of them,
about using the NMES database with the missing-data 
problem that you had identified without at least 
having some multiple imputation analysis done? 

A. I certainly -- I think everyone was
aware that the inferences that come out of any
analysis that doesn't address the missing-data
problem in some valid way, that those analyses will
not be valid. But I think that the purpose of the
analyses they were doing often was to show, even
[illegible] or even assuming what the plaintiffs did,
[illegible] data was valid; the plaintiffs' analyses
were [illegible]. So it was sort of this --
Q. Well, you understand that in these cases
Wecker and McCall, I'm not talking about the AG
cases, I'm talking about the trust fund cases
in Washington and Ohio at least where they have
been designated and have given reports, that they
were going beyond -- well, that these experts,
Wecker and McCall, weren't criticizing
Dement-Roberts for utilizing the NMES database, because
they didn't use it, but rather were themselves
asserting by our analysis of the NMES database we
believe there are no differences or minimal
differences between smokers and non-smokers in this
trust fund population. Is that your understanding?

MR. BIERSTEKER: Let me just 

interpose an objection. 

MR. WITHEY: To the form, okay. 

MR. BIERSTEKER: To the form of the
question and to the extent it mischaracterizes the
record.

A. To the extent I have only seen what they
did in the Ohio case and have only briefly seen
what they did in the Northwest, I don't know that
I'm certain that I really know what you stated.
I mean, I do know that there is interest in trying
to use NMES to get, quote, the right answer,
unquote, to do something that's valid,
statistically valid using NMES as a core database.


Q. Let me ask it this way in other words.
If anybody on either side of this litigation,
whether it be a plaintiffs' expert or a defense
expert, said "We're going to use the NMES database,
we understand there's a missing-data problem but
we're going to go ahead and generate data and
conclusions based upon those data without using a
multiple imputation analysis," you would say "I'm
sorry, you really shouldn't do that because it's
not valid." Fair enough?

A. Almost. Because there are other ways
that one could try to go after the missing-data
problem that do have some validity in certain
cases, I think they are much more difficult than
multiple imputation. And to be clear, one of the
issues is that often when you do a valid analysis,
whether it's multiple imputation or
weighting methods or maximum likelihood, the
standard errors always get bigger; the confidence
intervals always get broader. In some cases the
point estimates don't change all that much.
So.... But in general what you're saying, with
certain caveats, is correct.

Q. In other words, there would be much wider
confidence intervals in the data that was derived?

A. That's right, because the invalid ways of 
handling missing data, even if they're sort of 
correct in sort of the point-estimation phase, 
pretend that they know something with certainty. 

And how do you know something with certainty if 
you're making it up and putting it in? You don't. 
And if you act as if you do, you're going to
underestimate uncertainty.

Q. What are the other potentially valid 
analyses of the missing-data problem, if you can 
identify them, other than multiple imputation? 

A. In this particular kind of complex
situation I don't think the other methods work very
well.

Q. Okay.

A. In principle, if you have a small amount
of missing data and it's mostly something called
unit nonresponse, it's one person who refuses to
answer, weighting methods can work very well. If
you have a big model, really fully specified, you
can do maximum likelihood if samples are large or a
big model that sort of models everything
at the same time. And that can be valid.

Q. But those aren't useful when you have
such a complex set of data, correct, like NMES?

A. I don't believe those could be 
successfully employed with NMES. 

Q. And you wouldn't recommend doing those 
because you have the multiple imputation analysis 
yourself which you think could be used. Correct? 

A. Correct.

MR. BIERSTEKER: With respect to
NMES.

MR. WITHEY: Yes. That's what I'm
talking about.

A. With respect to NMES, yes.



Q. To your knowledge, has either Mr. Wecker
or Dr. McCall utilized any of these other methods
for correcting for the missing-data problem in
NMES?

A. I don't believe so.

Q. And I assume you've expressed this
[illegible] that you've just reached to
Mr. Biersteker. Correct?

A. Yes.

Q. And you've expressed it to Mr. Wecker in
whatever conversations you had with him. Correct?

A. Correct. Well, just to be clear, in
whatever conversations meaning in some
conversations at least but not all.

Q. Fair enough. Not every one but in some
one or more of the conversations with Mr. Wecker?

A. Correct.

Q. What was his response to that,
Mr. Wecker's?

A. Mr. Wecker's response? I think initially
it was sort of cautious. Mr. Wecker comes across
as being sort of cautious and he'll think about it.
I think more recently, and I don't know whether
it's in direct phone conversations or through Peter
Biersteker, I think he's become quite positive
about the idea of using multiple imputation; and
I believe this is through conversations with
Mr. Biersteker. That is as a result of his reading
up on literature and finding it convincing.

Q. You understood that Wecker submitted a
report citing NMES, utilizing NMES in the Northwest
Laborers case?

A. In Northwest? Again, I --

Q. I'm sorry. I beg your pardon. Do you
understand that Wecker submitted a report citing
NMES, utilizing NMES to support his opinions, not
just to criticize the plaintiffs' experts in --
Strike that.

Do you have an understanding that
Wecker has cited NMES in support of his opinions in
the Ohio laborers case, the ironworkers case?

A. I do not know that. I don't know that. 

Q. Would you agree based upon your
conversations with him that without performing the
multiple imputation analysis, which isn't done, 
that you would advise against him presenting data 
derived from NMES in his report? 


MR. BIERSTEKER: May I have that
question read back? Because I'm not sure
I followed it. There may be absolutely nothing
wrong with it, but I do want to hear it again.

MR. WITHEY: Sure. Go ahead.

(The reporter read the question.)

MR. BIERSTEKER: I object to the form
of the question.

MR. WITHEY: Sure. Go ahead.

A. Not necessarily, no. I would advise him
to say that if he were to do those analyses and
present them, he should say that they don't validly
handle the missing-data problem. Minimally,
uncertainty is going to be underestimated,
presumably, because whatever method was used to
adjust or take care of the missing values
underestimated uncertainty. And there is also
uncertainty about whether point estimates would
change.

That's different from advising him
not to do it. My advice would be to be completely
upfront about the potential problems with it rather
than sort of burying it and implicitly asking the
reader to be on his toes and pick up that problem.

Q. And that observation about being upfront
comes from your training in statistics that when
there are problems with the dataset, you should
identify them forthrightly. Correct?

A. Absolutely. And as an academic, that's
right, [illegible] should be scholarly about these things.

Q. And it wouldn't meet the standard that
[illegible] exists for the reporting of statistical
[illegible] put that out front?

A. Yes; in all types of scholarly
publications, sure. I am not at all a general
expert in litigation, where the standards for
reporting may be different, but I'm talking about for
the [illegible] that I write.

Q. Did you tell him that, what you just 

said? 

A. Did I tell him that I would advise him
to -- ? I don't believe I told him that as advice
because I wasn't aware -- Well, I was aware he was
preparing, I guess, a document. I didn't know he
had already done so. Which is the implication.
I guess maybe I should halt and ask: When you said
"that," what does "that" mean, that I told him
that?


Q. Fair enough. Did you ever tell Wecker
that if you, meaning Wecker or his shop, was going
to use NMES data, that they should put in their
report that there is the missing-data problem and
the generated data might underestimate the
uncertainty because of the missing-data problem and
may not be valid or words to that effect? I don't
mean to pin you down to particular words.

A. Probably words to that effect. But
I don't know whether they were focused on his
report. I think they were just comments on analyses
of NMES.

Q. And whether he put it in the report or
whether he gave testimony in court, the advice
would be the same to him. Correct?

A. The advice would be in the respect that 
these analyses would underestimate uncertainty. 

I don't think I ever gave him advice on what he 
should write or what he should testify. In fact,
I'm sure I did not. 

Q. And if Brian McCall called you up and
said "I'm going to use the NMES data in my report,"
you would give him the same judgment you just 
testified to. Correct? 

A. It is a judgment, correct; it's not 
advice. I mean, I don't believe I ever give anyone 
advice what to write or what to testify to. 

Q. And I assume you've expressed this to
Mr. Biersteker?

A. My judgment that any analysis based on it
would underestimate uncertainty, would
underestimate variability and may be off on point
estimates, yes.

Q. And that there may be missing data?

A. Absolutely.

Q. And that the problem may be addressed
through multiple imputation as it relates to the
NMES database?

A. Multiple imputation is probably the best
of the methods available to correct for the
problem.

Q. At the time of the Minnesota trial — 

You testified in that case. Correct? 

A. Yes, I did. 

Q. Do you recall being asked whether any
statistical analysis could be used to compare the
smokers and the non-smokers in NMES in a way that
would yield valid results?

A. Do I remember being asked that?

Q. Yes.

A. At the trial is your question?

Q. Yes.

A. I don't remember explicitly being asked
that, but it looks like you're reading from
[illegible] and I have no doubt that I probably was.

Q. And you're certainly entitled to read
the transcript on page 378. Let me just read you
[illegible] and you can look at it if you want to
know the context of it.

A. Sure.

Q. Your answer was "I haven't done that
investigation completely, but I -- I believe there
are methods."

A. Okay. Could I see that page and see the
context of it?

Q. Sure. Take a look at it.

A. I mean, I know exactly in some --

Q. This was on cross by Hamlin, I think it
was; and of course Mr. Biersteker was there.
Correct?

A. Yes.

Q. He put you on direct, right?

A. Correct.

Q. And I'm referring to the part where
there is an arrow at page 378 of the transcript that
we have. I'm not sure it's the official
transcript. Just read that over, take your time.

A. Sure, okay. I would have thought we were
[illegible] the context of regression analyses and
propensity score methods. (Pause) Okay, I think
[illegible] enough of the context; now I know where
[illegible].

Q. And correct me if I'm wrong now, but this
[illegible] which of course took place in --

MR. WITHEY: Gosh, do you remember
when it was, Peter?

MR. BIERSTEKER: I think it was about
a year ago.

MR. WITHEY: About a year ago, fair
enough.

BY MR. WITHEY:

Q. About a year ago you indicated you hadn't

[...]

mentioned, although I may not have, a book that Joe
Schafer published in probably '97 that was based on
his Ph.D. thesis.

Q. What is the name of that?

A. It's published by Chapman and Hall; it's
something like -- this is going to be wrong --
Multivariate Analysis With Incomplete Data,
some permutation of those words.

Q. It was a book?

A. Yes.

Q. Okay.

A. Then I sent him -- I probably sent him
this JASA article, Journal of the American
Statistical Association, an article that had
attached articles and discussion and rejoinder,
basically all focused on multiple imputation,
although some people talked of other methods for
handling missing data in surveys. That article had 
an extensive list of references on multiple 
imputation, both the theory of it and applications 
of it -- for example, the federal government -- 
that had taken place in the previous decade 
primarily. And I may have sent him some copies --
and I don't know -- of proceedings articles that
were jointly with people from National Center for
Health Statistics on multiple imputation in NHANES
that would not have been quite as accessible for
him because proceedings aren't as broadly
distributed as the formal journals.


Q. In any of the articles or reprints or
books you sent him, was the subject of the use
of MI for the missing data of NMES discussed?

A. It's possible because I may have referred
to articles that talked about, well, talked about
imputation in NMES. The author was Sommers,
who was at the agency. I think there
was one article that was pointed out to me, or
maybe definitely, yeah, one article was pointed out to me
by Hamlin prior to a deposition in Minnesota and
there was another article that was by the same guy,
Sommers. And I think he actually mentioned
multiple imputation, although it was not focused on
it, but mentioned multiple imputation as a
possibility for NMES. But I may have -- I may
have given those references to Wecker; I may not
have.
Q. And these were publications by Sommers,
by yourself?

A. No. One by Sommers, I think one by
himself, and one was with someone else. And
I don't even -- I say I believe they're in
proceedings. But my memory is when I was handed
the one at the deposition that Hamlin was doing of
me it had no date on it, it had no where it was
published on it, and I asked him and he responded
like "I ask the questions, you don't."
have a clear memory of where it was 
It looked like it was a proceedings 

Q. Let me ask you if you could do this for
us. Could you pull the reprints, the articles,
books, proceedings, reports that address the issue
of the use of multiple imputation in missing-data
analysis with respect to NMES alone?

A. Purely with respect to NMES? 

Q. Yes. 

A. I believe there are only two that I am 
aware of.

Q. All right. Could you pull those and have 
those copied and provide them to Mr. Biersteker? 
And we'll ask that they be marked as Exhibit 3
in this deposition.

A. I think the order would have to go the 
other way. 

MR. BIERSTEKER: The two Sommers'
articles you're talking about?

MR. WITHEY: If those are the only
two.

THE WITNESS: Specific to NMES;
those are the only two.

MR. BIERSTEKER: I think there were
exhibits in the Minnesota case. I'll look and I'll
be happy to send them to you -- we don't have to
involve Professor Rubin -- if I can find them.

MR. WITHEY: That's fine.

THE WITNESS: I don't even know if
I have copies.

MR. WITHEY: Okay. 

BY MR. WITHEY: 

Q. Did you have the understanding that 
whenever, if at all, the missing-data problem with 
NMES could be analyzed using the multiple 
imputation method, that whatever work product 
either you or Professor Schafer generated would be 


submission of reports in the ironworkers case in 
Ohio? 

A. That was a long question. Maybe I could 
have it read back, make sure I understand it. 

Q. Let me restate it. It was long. 

You had the understanding that if you
or Schafer were going to do the multiple imputation
analysis of the missing data in NMES, it would be
provided to Wecker and McCall?

A. Yes.

Q. And you understood that you yourself were
not going to do what they were going to do in terms
of actually utilizing the multiple imputation
analysis to then generate data from NMES on smokers
versus non-smokers and medical expenditures.
Correct?

A. Correct. Except the last "data" you said
should have been "analyses," I think.

Q. All right. Thank you. 

A. So I was not going to be involved in 
doing their analyses. I was not going to do their 
analyses on the multiply imputed dataset which 
I delivered to them. 

Q. Right. And that's because they have 
their areas of expertise and you have your area of 
expertise. Fair enough? 

A. Correct. 

Q. And you weren't going to try to go beyond
the discipline that you have in statistics. Fair
enough?

MR. BIERSTEKER: Well, I think --

MR. WITHEY: I'll withdraw that
question.

A. In fact, the previous question, to be
sure, I mean, they're going to do analyses and I'm
not guiding their analyses. So they're going to
do some things and maybe there are some things that
I would do, maybe there are some things that
I wouldn't do. I don't know what they have in
mind. But I'm not controlling them.

Q. Let me just ask you if you understand


some of the basic ways in which the NMES data was 
generated. With the exception of -- I’m sorry. 
What's her name, Nancy Maskewicz? 

MR. BIERSTEKER: Mathiowetz. 

Q. — Mathiowetz, have you ever talked to 
anybody who was involved with the NMES survey 
itself? 
A. Not that I explicitly recall, but it's
possible. I'll give you an example of a situation
where I might have and I just don't remember. At
the annual statistics meetings that take place,
there are many people who are there. NMES is a
survey; there's a section of the American
Statistical Association, just like there's a
section on applied statistics in epidemiology, there's
a section called survey research methods; and I'm
in that section. I was chair of that section
last year. So there was a meeting of all the
people who were members of that section and there's
a lot of social conversation that takes place
because there's coffee before the meeting, and it's
certainly possible that I talked to some people who
are involved in NMES about NMES there, but I don't
have a specific recollection of it.

MR. WITHEY: By the way, do you have
a spelling on Mathiowetz?


MR. BIERSTEKER: I don’t, no. 


BY MR. WITHEY: 


Q. Have you ever reviewed the data, the NMES 


data itself? 


A. I reviewed the structure of the data but 
individual records.

Q. Were there predecessor studies to NMES? 

MR. BIERSTEKER: I'm sorry. Could 

you repeat that? 

MR. WITHEY: Let me restate. I'm not
sure I finished it.

BY MR. WITHEY:


Q. Was an understanding of the relationship
between health care expenditures and smoking an
analytical objective of NMES?

MR. BIERSTEKER: Object to the form
of the question.

A. I don't believe it was one of the
explicitly stated objectives, but they did collect
information that was relevant to that question.

Q. Were there predecessor studies of NMES
that you are familiar with?

MR. BIERSTEKER: Object to the form. 

A. I'm not sure what that means. 

Q. Well, in light of your problem let me put 
it this way. NMES was not one survey. Correct? 

A. My understanding is that that's correct. 

I believe there's a more current one than the one 
that we were analyzing and I believe there was a
previous incarnation. 

Q. Are you familiar with the national 
medical care expenditure survey conducted in 1986? 


A. Not offhand, no. I mean, I probably am,
but it's not coming to mind.

Q. Are you familiar with the national
medical care utilization and expenditure surveys in
1980?

A. Again, I may be, but not explicitly under


Q. Have you conducted any analysis of either
of those two surveys to determine the statistical
validity of them, particularly as it relates to the
missing-data problem?

A. I certainly don't think so.

Q. Did they include questions on smoking?

A. I don't know. 

Q. Do you know what the medical expenditure 
panel survey is, done in 1996-97? 

A. I don’t believe so. 

Q. Do you know if they had any questions on 
smoking?

A. I don't know. 
name of the survey? 

Q. The medical expenditure panel survey, 
U.S. Government. MEPS, I think it's called. 

A. I've seen that MEPS, but I don't know
what the content is.


Q. Have you made any effort to compare MEPS
done in '96 and '97 to NMES as it relates to the
issue of smoking and expenditures?

A. No, I have not.

Q. Do you know how the questions posed in
the survey for NMES were collected?

A. How the questions were collected?
I don't quite know what the question means, but
probably on almost any reading of it I probably
don't. I mean any reading of the question.
I probably don't know.


Q. Do you know what goals drove the
inclusion or deletion of specific questions in that
survey?

A. So, do I know the process by which
questions were included or not included in that
survey? No, I do not.
Q. Do you understand that the framing of the
questions and the inclusion of certain questions
and not others was not only based upon research
goals but also political processes?

MR. BIERSTEKER: Object to the form
of the question. It assumes facts not in
evidence. But if you have any understanding with
regard to that process, Doctor, you may answer.

A. Could you start that sentence again?

Q. Do you understand in a survey certain people
are generating the survey have to decide which
questions to ask, which ones will be included and
which will not be included. Fair enough?

A. Correct.

Q. And you understand that there is a
process by which people filter which questions are
going to be asked and which are not. Right?

A. Right. 

Q. Do you know that that process of 
including or excluding questions was based both on 
research goals as well as on political 
considerations? 

MR. BIERSTEKER: Same objections as 

I asserted previously. You may answer. Doctor. 


A. I don't know really what's meant by 
"political" there. I do know that decisions about 
which questions to include on a survey are made by 
people and to the extent that people make decisions 
based on lots of reasons, that much I understand. 


So perhaps some of those reasons are research and
some of them might be termed political. I imagine,
if you asked them, they would not claim they were
political. Obviously, people might think they're
political. I do understand that which questions to
include on a survey, those decisions are made by
individuals, and individuals have various goals.

Q. Did you know that there was generally a
household component of the NMES survey?

A. As I recall, yes.

Q. Weren't there also American Indian and
native Alaskan components of that survey?


A. By components, you mean? I'm not sure 
what you mean by components. Maybe you could 
clarify that. 

Q. That there was a focus or an aspect of 
the survey that wanted to look at American Indian 
and native Alaskan health expenditures. 

A. To the extent that the weighting 
have to b
s supposed
uld be rep
it may we
ditures.
the extent
I don't k 
supposed t 
sed to be 
bgroup had 
re trying 
e weighted 
epresentat 
sampled? 


to make it nat 
bgroups are pa 
esented. And 
1 be that thos 
ause there may 
their health 

t know that? 

I said if they 
ow what the we 
be representa 
epresentative 
some represent 
o do that, sma 
oversampled, 
on there. 
lized populati 


A. Institutionalized population? I don't
believe it was. But I'm not sure.

Q. Let me ask you this: How many questions
were involved in the NMES survey?

A. I don't know. 

Q. I mean, it is fair to say you have
achieved some prominence as someone who has
analyzed surveys from a statistical basis.
Correct?

A. Designed analyses for it would probably
be more accurate than prominence for having
analyzed.

Q. Okay. In other words, you said you were
on one of the committees or subcommittees of the
ASA related to surveys?

A. Yes, I was chair of the survey research
section last year.

Q. So the questions I'm asking you are
certainly within your area of expertise, correct,
as to how a survey is designed, how questions are
selected, what questions are selected, et cetera.
Is that fair enough?

A. Yes and no. I mean, that's not an area
in which I consider myself a particular expert in 
the details of federal surveys or the list of 
federal surveys or how many questions they have or 
what the particular weighting structure is. My 
expertise is more focused on the general issues of 
what things should be thought about when designing 
surveys, what things should be thought about when 


JONES FRITZ & SHEEHAN 


http://legacy.library.ucsf.efiU>tiGt/pob|Q^alM>i^tf#.industrydocuments.ucsf.edu/docs/xygl0001 








taking care of missing-data problems. 

Q. Did I ask you this? How many questions
were posed in the survey for NMES?

A. Yes, you did ask that question.

Q. And your answer was?

A. I didn't really know exactly.

Q. Hundreds, though. Fair enough?

A. There were a lot of them, yes.

Q. Maybe even as many as a thousand?

A. Do I think it was a thousand? It's
possible, depending upon how you code the
data, that there might be a thousand fields.
I would doubt it, but it's possible.

Q. How many questions on smoking appeared in
the self-administered questionnaire?

A. I don't know. But I know it had detail
beyond ever, never, and former.

Q. Do you know there's only like a handful 
of questions, under five, about smoking? 

MR. BIERSTEKER: Object to the form. 

BY MR. WITHEY: 

Q. Or you don't know? 

A. I wouldn’t be surprised that there were. 
But sometimes — Let me give you an example. 
I don't know the structure of the survey but I can
ask one question: How many years ago did you last
smoke? How many cigarettes a day did you smoke
then? Or I could ask those same two questions in a
form where it could take ten questions by asking,
you know, yes/no kinds of questions: Were you
smoking five years ago? Were you smoking ten years
ago? Were you smoking fifteen years ago? Were you
smoking twenty years ago? Instead of just the one
question: How many years ago were you smoking?
So there are different ways of asking
questions, and I don't know how you count those
questions. So if you're telling me there were five
questions --

Q. I'm not telling you anything. Do you
know if there were less than five questions on
smoking in the self-administered questionnaire?

A. Do I know if there were fewer than five 
questions? I know there’s some detail. I don't 
know exactly, no, I don't. 

Q. Let me ask you this: Do you understand 

the process of pretesting questions to be used in 
surveys of this kind? 
A. In general, yes.

Q. Describe that process. 

A. Well, it is common after initial
questions are collected that a survey goes out to
pilot study where they look at the response rates
and the kinds of answers that are generated to see
if they make sense and if they can get decent
response rates on them.

Q. And is that recommended to pretest
questions in surveys like this?

A. It is usually a wise idea in an expensive
survey to do a pilot survey first.

Q. And can the researchers use focus groups
for the purpose of pretesting questions from time
to time?

A. Can they? They certainly can.

Q. What were the questions used in NMES
related to smoking?

A. I don't know the exact questions. 

I think that some of them were "Have you ever 
smoked in your lifetime a hundred cigarettes" or 
maybe it was the number of cigarettes that you 
smoked. I don't know. "Are you currently a 
smoker?" I mean, again, these are not the exact 
questions because I don't know the exact questions, 
but I'm talking about the kinds of information that 
one could get from the questions; and how heavy a 
smoker you are now or were, how many years since 
you quit, I believe. 


Q. How was the current smoker question
asked?

A. I don't exactly know.

Q. Would a question that asked "Are you
currently smoking at least one cigarette a day" be
an appropriate test of current smoking?

A. Well, that's not my area of expertise.
I don't know whether one cigarette a day would be
considered by epidemiologists or medical
researchers to be a serious amount of smoking.
I would doubt it. But that's not my area of
expertise.


Q. It would require some judgment in that 
area of expertise, let's say epidemiology, in order 
to decide exactly how to frame that question then. 
Fair enough? 

A. That would be helpful, that's right. 

Q. Were the questions asked in NMES 
regarding smoking pretested? 
A. I don't know for a fact. I imagine so.

Q. If it wasn't, it would be a limitation on
the nature of the survey. Fair enough?

A. I don't know whether it would be or not.
Again, that's for someone else to judge. For
example, there are some questions that probably
aren't pretested that people don't think limit the
survey. For example, "Are you a male or female"
I imagine is not pretested. And again, if somebody
decides, the epidemiologists, or it's understood in
the field that a question in other contexts has a
reliable enough answer like "How many cigarettes do
you smoke per day," maybe from other surveys they
know that. I'm just not an expert in that area.

Q. Did you ever discuss with Nancy
Mathiowetz whether the authors of the NMES study
anticipated that the NMES data would be used to
determine whether an association exists between 
health care costs and smoking? 

A. I didn’t have a one-on-one discussion to 
that effect, but I believe I remember that point 
coming up in a telephone conversation. 

Q. And her response was what? 

A. That it was not originally designed to 
address that specific question, although in part of
that conversation that always happens with 
surveys. Surveys are used to address all kinds of 
questions that they originally were not designed to 
answer. 

Q. But given the fact, as you understood
it, that the authors did not anticipate that
the data would be used to determine whether an
association exists between health care costs and
smoking, those questions weren't then given a high
emphasis in the array of questions that were being
asked. Would you agree with that?

A. That I don't know. And in fact I don't
know that the survey didn't have as one of its
objectives to look at the relationship between
smoking and health care costs. I do remember that
the survey was not designed, I believe, to estimate
these relative risks of smoking having adjusted for
background variables to try to get the causal
question. I think that's right. But whether --

Q. It wasn't designed for that purpose. 
Correct?

A. It was not designed, as far as I remember 
the conversation, to address the causal question; 
although I don’t remember hearing conversation 
about whether it was or was not partially designed 
to look at the associational question. I don't
remember. 


Q. I would assume you would defer to her at
least in her recollection of what the design purposes were
or weren't, because she was an author. Fair
enough?

A. Certainly with respect to design --
Well, I don't know whether she was an author. If
you're representing that she was an author --

Q. No. I thought you said she was involved
in the process.

A. She was involved in the process. She
worked there. I don't know whether she was an
author or not.

Q. Okay.


A. But she certainly has more knowledge,
I believe, than I do, which is very little, about
the process that was used to select the questions 
that were included. That's a different issue, of 
course, than the analyses that one can do on the 
data once they are collected. 

Q. Would you agree with the proposition that 
Q. Assuming Nancy Mathiowetz says there was
no intent in collecting the NMES data to determine
whether an association existed between health care
costs and smoking, that therefore the data should
not be used for that purpose?

A. I do not agree with that. 

Q. So if she were to testify to that or 
state that in a report, you would disagree with it. 
Correct? 

A. Yes. Solely for that reason. If the
reason why it should not be used is because there
was no intention when they collected the
information, that's not a sufficient reason to not
use it for that purpose. There may be other
reasons, but that's not sufficient.

Q. Now, let me see if you can agree that
there were the following potential sources of error
associated with NMES. Okay? I think you've
already testified about nonresponse bias. Correct?
In the Northwest Laborers case.

A. Right. I mean, that's a form of the
missing-data problem. 

Q. There were measurement errors. Correct? 
A. Well, there are always measurement errors 
in a survey to the extent that some people may 
misunderstand questions. 

Q. Were there measurement errors as it 

related to associated with health care utilization 
and expenditures? 

A. Presumably. 

Q. Were there measurement errors associated 
with reporting of smoking behaviors in NMES? 


A. Presumably. 


Q. Were there measurement errors with
respect to other health behaviors in NMES?

A. Presumably.

Q. Were there measurement errors with
respect to the issue of Medicaid recipiency?

A. Presumably as there are in all surveys.
Surveys are subject to errors of reporting.

Q. Well, but all surveys don't look at
Medicaid recipiency, do they?

A. No. But the caveat is "all surveys"
would apply implicitly to all my other answers as
well.



Q. Would you agree with the statement that 
given these error sources and potential bias, that 
the NMES data is not of high enough quality for the 
purpose of determining the existence of and 
quantifying the extent of the relationship between 
health care costs and smoking? 

A. Not necessarily, no. 
Q. So you would disagree with that opinion
if expressed by an expert. Correct?

A. Let me be clear about what it means to
disagree. I cannot agree with it. Okay? That is
not necessarily the same as disagreement.

Q. Well, do you disagree with that statement
then?

A. I cannot agree with it. Do I disagree
with the way it is stated? Yes. And let me be
clear what I'm saying. What I think is I don't
believe the opposite, necessarily. I am not
claiming that the errors are so small, the quality
is so high that it can be used to support it.
Okay? I'm not claiming that.

Q. So what is your opinion?

A. That I would rely on people who know
the data better than I do. It is not
one of my areas of expertise.

Q. Fine. 

A. But because of the reasons that are 
given, I can’t agree with it. 

Q. Let's say one of the authors or let’s say 

Nancy Mathiowetz made that statement. I’m not 
saying she did. But if she did make that 
statement, would you defer to her because she was 
involved at least in knowing how this data was 
collected? 

A. I would defer to her as one of the people
who knows more about it than I do, but I wouldn't
necessarily solely base my position on what she
says. There may be other people who were even more
involved, I don't know, who would say it is good
enough quality.

Q. Let me ask you this. You understand that
NMES was designed to look at the national
population. Correct?

A. Correct.

Q. Is the sample size sufficient to make
state-level estimates in a state as small as let's
say Ohio?

A. It depends upon how the analyses are done
and what assumptions we are willing to make
explicitly about it.

Q. Do you have an opinion then whether it 
could be applied to the state of Ohio, the NMES 
database?

A. I believe that if the analyses are done
carefully, that it probably can. But I would have
to look more carefully at that.

Q. Meaning with multiple imputation for 
missing data? 

A. Well, I wasn't necessarily thinking about
that as the key ingredient. It's more not just
using people who live in Ohio to do the
analyses but doing things that use information in
an appropriate way from other states as well.
Shall I try to clarify that for you?


Q. Okay.

Are you aware of any critique in
Oklahoma, let's say, because that's a relatively
small state, are you aware of any defense expert
who has critiqued one of the plaintiffs' experts'
use of NMES data nationally to apply to the
population within Oklahoma as being inappropriate
or invalid?


MR. BIERSTEKER: You mean the whole 


state of Oklahoma or the Medicaid population? 

MR. WITHEY: Medicaid recipients in 

Oklahoma. Thank you, Peter.

A. Yes, I am aware of criticisms of that. 
Q. By defense experts. Correct? 
A. Criticisms of the analyses that were done
in applying it, yes, that's right.

Q. Do you know which defense experts made
that criticism?

A. I don't know for a fact. My memory is
I expect that Wecker did. Maybe McCall,
there is, Oklahoma also had the feature
that there's a fairly heavy percentage of American
Indians in Oklahoma that made the Oklahoma Medicaid
subpopulation more different from the national
Medicaid subpopulation than most other states,

Q. Do you know if that criticism would apply
to the application of NMES to a subpopulation of
trust fund beneficiaries in the state of Ohio?

A. Well, the specific criticism about
American Indians probably would not. The more
general criticism has to do with the extent to 
which the national -- We're not talking about 
Medicaid, we're talking about union now. Right? 

Q. Yes. 

A. -- how the sort of national population 
differs in important ways from the Ohio and how one 
does the adjustment for those differences. 


Q. So you're saying it is possible to do it
right?

A. Yes. Well, I'm saying I believe it is
possible to do it right. I mean, we'd have to get
into the data more, but it is possible to do it
right. The issue has to do with how different the
the group is, the rest of the population,
population you're interested in on really
background variables. And in Oklahoma
this American Indian indicator which is a
the sense that they really are very
with respect to that variable. And how
the modeling to use other information
the subpopulation you really care about,
farther and farther the rest of the group gets
target population, the harder it is to do
this well.

I was saying in Oklahoma it looked
like the target population was quite far from the
general population in NMES whereas in the Ohio
union I don't believe it is that far away. But it
still has to be done carefully. The analysis still
has to be done carefully.

Q. You agree that the 1987 NMES survey was
designed to produce national and regional estimates 
of health care utilization, expenditures, sources 
of payment, health care insurance coverage for the 
United States non-institutionalized population. 


Correct?

MR. BIERSTEKER: Objection, compound.

BY MR. WITHEY:

Q. Do you agree with that statement?

A. To the extent I understand it, yes.
I mean, I think the answer is yes to what you're
trying to ask me.

Q. Would you agree with the statement that
the sample size is insufficient for making state-
level estimates except for seven of the largest
states?

A. Not necessarily, no.

Q. Was this a complex design?

A. The survey design?

Q. Yes.

A. Yes.

Q. It was stratified. Correct?

A. Correct.

Q. It was clustered?

A. Correct.

Q. It had what are called multistage areas?
A. Correct. It was a multistage survey.

Q. It was a probability sample design?

A. Correct.

Q. And did they selectively oversample any
policy-relevant groups?

A. I believe they did, yes.

Q. Which ones?

A. I knew that was coming. Do I actually
know? Well, typically what's done but I don't know
small groups, so I would expect that
minority groups were oversampled.

Q. Hispanics?

A. I would expect so, but I don't know for a
fact. I'm talking from general knowledge of the
way that surveys are done.

Q. Would you again defer to one of the 


authors or perhaps Nancy Mathiowetz as to what 
relevant policy groups were oversampled? 

A. I would defer to hundreds of people on 
addressing the specific details of which groups 
were oversampled. 

Q. Were the functionally impaired
oversampled?

A. Do I have any idea? Hmmm. I just don't
know.

Q. Do you know what software package was
used?

A. To do what?

Q. To generate the data.

A. I don't know what that means. Software
packages are usually to analyze data, so I'm not

Q. To analyze data, you're right.

A. Okay. Who's doing the analysis?

Q. Yes.

A. No, I'm asking you. I don't understand
the question.

Q. Do you know what software package was
utilized to analyze the data generated from NMES?

MR. BIERSTEKER: By whom?

Q. By anybody. First of all, let's say by 
NMES or by the authors and researchers that were 
involved with NMES. 

MR. BIERSTEKER: I will object to the 

form of the question. I don't understand it. 

A. I don't really understand it. I can try
to make an attempt to be helpful. I mean, there
are several --

Q. I'm going to get to Keeker and what
software package he used, if you know.

A. No, I don’t know. 

Q. You don't know, all right. Is there a
software package spelled L-I-M-D-E-P?

A. I don't know. The one that I thought was
going to be mentioned is SUDDAN, which is I think
S-U-D-D-A-N, all capitalized, which is a commonly
used software package for the analysis of complex
surveys.

Q. Do you know if there are any assumptions
in the software package that the data to be
analyzed was obtained from a simple random sample
and the sample cases were selected with equal
probability?


A. Which package are you referring to? 

Q. The SUDDAN. 

A. No, no, I believe that is explicitly 
designed to deal with complex surveys. 

Q. So you would recommend SUDDAN be used as
opposed to a more simple software package or a
software package designed to analyze simple sample
collection. Correct?

A. Not necessarily, no. It depends upon the 
analyses and the purpose of the analyses. 

Q. Is there a department of biostatistics at
Harvard?

A. Yes, there is.

Q. Are you on that faculty?

A. No, I am not.

Q. What is the difference between
biostatistics and statistics?

A. Well, there are two answers to that.
One is at Harvard and one is generally.
At Harvard the department of statistics is in
the faculty of arts and science, which is the
same as the college and the graduate school that
grants Ph.D.'s. The department of biostatistics
is in the school of public health and does not
grant Ph.D.'s; it grants doctor's of public health
and master's degrees. Our department is about
seven faculty, all of whom do teaching. I think
their department has, I don't know, fifty
faculty, most of whom are supported on soft money,
grants, and do contract and grant research.

In general, statistics as a field is
broader than biostatistics, although biostatistics
as a field is deeper and more focused on particular
kinds of problems that arise in sort of biomedical
research. For an example I think most of the
tenured faculty, maybe all the tenured faculty, in
the department of biostatistics at Harvard have
their degrees from the department of statistics.
And I think that's generally true, that the tenured
faculty in biostatistics departments tend to have
their Ph.D.'s from departments of statistics, not
departments of biostatistics.

That may be changing in time, but
biostatistics departments tend to be more applied
in the sense of dealing with particular problems
and not quite as sort of -- this is just a
tendency, I'm not criticizing anybody -- but the
biostatistics departments tend to be less, quote,
scholarly academically and more oriented toward
dealing with real problems. It is a huge field
because they're involved in all kinds of specific
problems that are very important.

Q. And so I assume the obverse is true, that
people with degrees in statistics as opposed to
biostatistics do not have that extra discipline of
the biomedical side of the discipline. Fair
enough?

A. No. Because, for example, how can [...]
not [...]? How can someone be chairman of a
biostatistics department, having gotten their
degree from a department of statistics, and not
understand the discipline of biostatistics? The
biostatistics chairman at Harvard, Nan Laird, got
her Ph.D. from Harvard in the department of
statistics, [...] chairman at Johns Hopkins, Scott
Zeger, [...] Ph.D. from Princeton University, where
[...] and never has been a department of
biostatistics.

And I could probably continue [...]
names of chairmen of biostatistics. Bob [...],
who for many years was chairman of the department
of biostatistics at UCLA, got his Ph.D. from
Harvard in statistics, not biostatistics. The
current chairman of the University of Michigan
biostatistics department, Rod Little, got his
degree from Imperial College in statistics, not in
biostatistics.

I could run down the line. I mean, it may
be very difficult to find a chairman of a
biostatistics department who got his Ph.D. in
biostatistics. Maybe not very difficult, but of
the prestigious ones I think it's uncommon. So
I just can't agree that they would not know
something about the field because they got their
Ph.D. in statistics.

Q. But I didn't ask that.

A. Well, it was something like that.

Q. I said is there an additional field of
study biostatistics involves that statistics
doesn't. That's my question.

MR. BIERSTEKER: Are you asking the
same question? I'm sorry.

MR. WITHEY: That is my question.
I don't care whether I asked the same question.
That's my question.


THE WITNESS: Okay. Would you ask it 

again? Read it back, please. 

(The reporter read the question.) 

A. Not an additional field. There are
topics that biostatistics departments will focus on
and go into more depth than typically is done in
departments of statistics. For example, there are
many Ph.D.'s in departments of statistics who have
no interest in or expertise in biostatistics, but
then obviously there are some who have a great deal
of interest and a great deal of expertise.
Certainly if they go on to become chairmen of
departments of biostatistics in prestigious
universities they must have some knowledge and
interest in biostatistics.

So the original question, I think,
was inclusive of all people who get Ph.D.'s in
statistics, and that's why I went into my long
[...]

Q. Is there a class in Harvard called
biostatistics?

A. Is there a class? You mean a course?

Q. Yes, a course.

A. A course that's just labeled
biostatistics, you mean?

Q. Yes.

A. Well, I believe that all the courses in 
the department of biostatistics will have labels on 
them like Biostatistics 1222 or Biostatistics 101. 

Q. Have you ever taught that course? 

A. I'm making up numbers. I actually have
no idea. Have I ever taught a course in the
department of biostatistics? No, I've never taught
a course. Have I been invited to? Yes, I have.

Q. You are not an expert in modeling health 
care expenditures. Correct? 


A. Well, I wasn't a few years ago. Given
what I've read, maybe I am.

Q. Have you ever offered expert opinions on
modeling health care expenditures?

MR. BIERSTEKER: I'm going to object
to the form of the question as being vague.

BY MR. WITHEY:

Q. You can answer the question.

A. Well, in some sense I think that's what
I've been doing.

Q. Have you consulted with any
epidemiologists about your opinions in the
ironworkers case?

A. Specifically in the ironworkers case? 
Probably not specifically in the ironworkers case. 

Q. Well, you didn't consult with any 
epidemiologists, for instance, in your testimony in 
the Minnesota trial, did you? 

A. I don't think that's accurate. Consult
in the sense of getting advice, maybe that's
correct. But did I talk to people who are --
I don't know. But, I mean, it also clearly depends
upon what you classify as an epidemiologist.

Q. Let me put it this way. Have you gone to
an epidemiologist and said "Here are my opinions.
I want to discuss them with you"?

A. No, I don't believe I have.

Q. You didn't do that in the Minnesota case,
did you?

A. I don't -- Well, if I haven't.... Oh,
I see what you're asking. The first question is
with respect to Ohio?

Q. Yes.

A. No, I have not. And Minnesota, I don't
believe I did.

Q. A year ago you did not claim to be an
expert in the modeling of health care problems in
the Minnesota trial. Is that correct?

A. I don't remember. 

MR. BIERSTEKER: If you have a
particular portion of the transcript you want to
refer him to, Mike, why don't you do that?

BY MR. WITHEY:

Q. Page 348 of the same transcript. Let me
see if I can show you this question:

"Now, you're not an expert in the 
modeling of health care problems. Correct? 

"Answer: I wouldn't claim I'm an 

expert in that, no." 





Right there. 

A. Where is that? In the modeling of health
care problems, so not expenditures.

Q. Was that truthful testimony?

A. Oh, yes, yes.

Q. Is it still?

A. Well, I don't know. It depends upon what
you mean by problems. I mean, if it was
expenditures, I certainly think I know more now
than then. And the fact that I have written this
piece and have been asked to present it and been
asked to present it at CDC, maybe some people would
regard me as an expert in it now. I certainly know
more about it now than then.

Q. Do you hold yourself out to be an 
historian? 

A. Not even close. 

Q. Do you hold yourself out to be someone
who has special training in the history of the
tobacco industry?

A. No.

Q. You are not a medical doctor?

A. Correct.

Q. You are not an expert in the diseases
caused by smoking?

A. Correct.



Q. You are not a health care economist?

A. That's correct.

Q. You are not an economist. Correct?

A. I am not trained as an economist. I've
[...] courses in the department of economics at
Harvard, graduate courses on causal inference, how
to do causal inference. I have been asked to give
seminars at the Kennedy School on causal inference
[...], by economists.

Q. You are not a public health expert.
Correct?

A. No.

Q. Now, you would agree that, like 
statistics, each of these disciplines that I’ve 
just mentioned has tools of analysis. Fair enough? 
A. Correct. 

Q. And each of these disciplines use those
tools of analysis in making judgments and giving
opinions in their field. Correct?

A. That's a general question, so I don't see 
how you could answer anything but yes to that 
general question. 


Q. And you have not used any of these tools
of analysis of these other disciplines, historians,
epidemiologists, medical care doctors, health care
economists, public health officials, in rendering
any of the opinions that you have reached in this
case. Is that correct?

MR. BIERSTEKER: I object to the form
of the question.

A. No, it's not correct.

Q. Which tool of analysis for history have
you used?

A. Well, I don't even know what that
question means, tool of analysis for history.
Maybe if you could give me an example of a tool
I could better answer the question.

Q. Well, I previously had asked you whether
you agreed that, like statistics, the other
disciplines I'm referring to, history, health care
economics, et cetera, have tools of analysis used
in deriving judgments.

A. Presumably they do, yes. 

Q. And I'm asking whether you in your own
mind believe you have applied the tools of analysis
of an historian in making any judgments in this
case.

A. As historian, no. May I clarify the
[...]? But I think before, you asked whether
I used any of the tools in history, and I just
[...] whether I have. Because I have memories
[...] past that historians have sometimes done
statistical analyses and claimed they are
[...] tools. In fact, I now remember one.
Frank Soloway wrote a book on historical attitudes
[...] scientific revolutions and he has an
appendix devoted to statistical methods, including
multiple imputation.

Q. You brought the tools of statistics and 
your training as a statistician to a particular 
problem that you were asked to relate that to. 

Fair enough? That’s what I hear you saying. 

A. Yes. 

Q. In other words, you didn't say I'm an
historian, I've reviewed this history and I've made
these judgments based upon the discipline of
history and what they look at and their tools of
analysis. You didn't do that, correct?

MR. BIERSTEKER: Can I just object? 

MR. WITHEY: Sure. 

MR. BIERSTEKER: And if you want me
to clarify the objection, I will. I'm having a
problem with the form of the question. If you
would like some input, I'll be happy to provide
more.

MR. WITHEY: No. Go ahead.

MR. BIERSTEKER: And it assumes facts
not --

BY MR. WITHEY:

Q. Do you understand the distinction --

MR. BIERSTEKER: May I finish my
objection?

MR. WITHEY: Sure.

MR. BIERSTEKER: And it assumes facts
not in evidence. Thank you.

A. I don't know whether, for example, in the
field of history they would now consider for an
example the tools that Frank Soloway uses,
statistical tools, as now part of the tools of
historical analysis. It was certainly used to
analyze historical questions.

JONES FRITZ & SHEEHAN

www.industrydocuments.ucsf.edu/docs/xygl0001

51956 9450

Q. You don't know that? 

A. No, but it might now be. They might now
take this as a tool in history. The same way
I understand that multiple imputation is now used
in some political sense and now may be considered a
tool of political science.

Q. I'm not asking you what might be a tool
of history. I'm asking you whether those things
[...] understand now to be a tool that
[...] use or analysis that historians use,
have you yourself used those?

A. I don't know because I don't know what
the toolkit is that historians think they have
available to them. I may have, I may not have.

Q. How about epidemiology? Have you used
the tools of analysis of epidemiology, the study of
diseases in populations, and an expertise in that
field to give opinions in this case?

A. The part of epidemiology that uses 
statistics, yes. 

Q. How about the part of epidemiology that
doesn't use statistics?


A. The part of epidemiology that doesn't use 
statistics, that doesn't analyze data, no, I have 
not used those tools. 

Q. You understand that it takes judgment as
an epidemiologist, for instance, to determine what
diseases might be caused by or associated with
smoking? That is an issue that epidemiologists
deal with. Correct?

A. I don't understand the question.

Q. Do you understand that epidemiologists
[...] on whether smoking is associated
with and causes diseases. Fair enough?
And that is not a field that you are an
expert in. Correct?

A. Not correct.

Q. You are an expert in determining whether
smoking causes diseases?

A. I am an expert on issues of drawing 
causal inference from observational data. 

Q. I didn’t ask you that. 

A. I believe you did. 

MR. BIERSTEKER: Well, I believe you 

did. Don’t argue with the witness. 


BY MR. WITHEY:

Q. Do you hold yourself out to be an expert,
because you testified you only had a lay opinion in
Washington, so I want to know this: Do you hold
yourself out to be an expert in what diseases are
caused by smoking?

A. That's a different question. Do you want
me to answer that question?

Q. No, it isn't.

MR. WITHEY: Let's read back the
question I asked you originally and we'll see if
it's a different question, Professor.

(The reporter read the question.)

BY MR. WITHEY:

Q. Are you an expert in what diseases are
caused by smoking?

A. No.

Q. Do you understand that determining what
diseases are caused by smoking is a function of
epidemiology or medical science?

A. Say that again? I'm sorry. I missed it.

Q. Do you believe that a doctor can draw a
conclusion about what diseases are caused by
smoking?

A. He can make an inference and he can
draw -- Is he able to draw the conclusion that,
that's valid, or may he? I'm a little....
I really am.... I'm not trying to battle with you,
but I'm just confused about what you're after.


Q. Do you go to a doctor from time to time?

A. Yes.

Q. Does the doctor ever tell you "I've
diagnosed this disease"?

A. Do I have a disease that I've ever had
diagnosed? Yes.

Q. I'm not going to ask you what it is.

A. No, that's okay.

Q. Do you believe the doctor then has
training in order to make that opinion?

A. To diagnose the disease?

Q. Yes.

A. Yes, he has training, yes. 

Q. Does an epidemiologist then have the 
training to determine whether diseases in certain 
populations are caused by certain factors? 

A. The epidemiologist has training to 
address those kinds of questions. 

Q. But doesn't have the training to make
conclusions about it?


A. The conclusions are always uncertain to 
some extent even if they are based on real 
experiments, and they are certainly more uncertain 
when they are based on observational studies. 


Q. Have you applied the discipline of an
epidemiologist in reaching any of the conclusions
you've reached in this case?

MR. BIERSTEKER: Let me have a
continuing objection to this line of questioning
[...] the questions assume --

MR. WITHEY: We don't need a speaking
objection.

MR. BIERSTEKER: It's not a speaking
objection. Come on, Mike, give me a break.

MR. WITHEY: No, I was called on the
carpet yesterday by Grossman for saying anything
other than "Objection to the form of the question."
So what I do is say objection, vague; objection,
misleading; objection, assumes facts not in
evidence, and give the form of the question, to be
helpful. What is your objection, counsel?

MR. BIERSTEKER: Fine. My objection
is that you are assuming facts not in evidence,
that tools are completely distinct between the two
disciplines, whereas in fact they may overlap.

MR. WITHEY: That's a horrible
example of coaching. That's just coaching.

MR. BIERSTEKER: It's not coaching.

MR. WITHEY: In fact, it's a great
example of coaching.

MR. BIERSTEKER: It's not an example
of coaching, it's an example of trying to clarify
the question.

THE WITNESS: Can we take a break? 

MR. WITHEY: We'll take a break in a
couple of minutes, unless you actually need one
right now.

THE WITNESS: I would prefer one
right now.

MR. WITHEY: We’ll take a break in a 

couple of minutes. 

BY MR. WITHEY: 

Q. Have you applied the tools of health 
care — Have you applied the analytical tools of a 
health care economist in rendering any opinions in 
this case? 

MR. BIERSTEKER: Objection to the
form of the question in that it assumes facts not
in evidence.

A. I have applied statistical tools that are 
also used by health care economists that overlap in 
doing my analyses. 


Q. What tools? Do you know any tools that a
health care economist would use in analyzing the
issue of damages in this case?

A. Have used or would use? Would use the
[...] modeling methods, regression kinds of
[...], propensity score methods that are methods
health care economists would use. So have I used
those? I have not done analyses using those. In
propensity score methods I have used I've described
the kinds of analyses that people should use,
including multiple imputation, that health care
economists have used. So I don't know how to
distinguish that. There are techniques that health
care economists have used that I have also used or
propose be used.

Q. Are there techniques and tools of 
analysis that health care economists use that do 
not involve statistics? 

A. Insofar as they are analyzing data and
drawing inferences from data, I would regard that
as using statistics.

MR. WITHEY: Do you want to repeat
the question, please.

(The reporter read the question.)

BY MR. WITHEY:

Q. Do you want to answer that one?

A. Insofar as the methods they're using
involve the analysis of data or drawing inferences
from analysis of data, then they're using
statistics. If they are drawing conclusions that
are not based on the analysis of data, then they're
not using statistics.

Q. Well, you've read the expert reports in
this case. Are there any areas of -- Withdraw
that question.

Do you bring any training and
discipline in health care to your opinions in this
case?

A. In health care, meaning how people should
care for the sick or how people should avoid
getting sick?

Q. Yes.

A. No, I don't have any expertise in advice
on maintaining your health or treating people once
they are unhealthy.

Q. Do you bring any expertise in health care 
utilization, that is, how people utilize health 
care services, to your opinions in this case? 


A. I think the best answer to that is no,
I don't.

Q. Do you bring any expertise in how health
care expenditures are paid for and by whom they are
paid to your opinions in this case?

A. No.

Q. Is there a doctor that you rely upon,
someone trained in medicine, for a judgment on
smoking and disease?

A. I don't think so because I don't think
I've [...] opinions, professional opinions about
smoking and disease. I have a lay view about
smoking and disease.

Q. So you would not consult with any 
particular physician or doctor on the issues of 
smoking and disease that you could name. Correct? 

A. That's not what I'm doing. I'm talking
about -- It's like asking me whether I've
consulted with anybody on something that I don't
do. What I do professionally is talk about how to
properly analyze data to address specified
questions.

Q. If I were to ask you, Professor, could
you name a person that you would consult if you
were of a mind to do that on the topic of diseases
caused by smoking, can you name a person that you
would ask questions of or consult with?

A. If I were to talk to somebody about what
kinds of diseases are presumably associated with
smoking and caused by smoking?

Q. Yes.

A. This information would only come from
what I've read in this litigation broadly, but
there are people I guess such as Samet at Johns
Hopkins [...] Harris seems to know a lot about it.

Q. Dr. Jeff Harris?

A. Yes. I think some health economists, Joe 
Newhouse seems to know a lot about it. Let's see, 
are there other people? 

Q. That's sufficient. We don’t need more 
than three names. Thank you. 

A. Okay, sure. 

Q. Do you consider the Surgeon General of
the United States an authority on the issue of
smoking and health?

MR. BIERSTEKER: Object to the form
of the question.

A. Presumably, yes. I'm sorry. In a
general sense, yes.

Q. And would you agree that Surgeon
General reports offer authoritative treatments of
the topic of smoking and human health?

A. In some general sort of way, yes.

Q. Meaning there may be some specifics in
the Surgeon General's reports that you might
not agree with but at least in a general sense they
offer [...] information for the general public
and for medical practitioners and others as to
smoking and human health. Fair enough?

A. General guidance on what at some
prima facie level the evidence appears to be.

Q. Do you subscribe to any medical journals 
or epidemiological journals? 

A. No, I don't subscribe to any. 


MR. BIERSTEKER: Mike, why don't we 


take that break now. Okay? 


MR. WITHEY: Okay. And we'll go to
12:30. Is that agreed?

MR. BIERSTEKER: You want to break
until 12:30? Or break briefly now and continue
until 12:30?

MR. WITHEY: Break now until 12:30.
Is that enough time?


MR. BIERSTEKER: Okay, that’s fine. 

(Luncheon recess at 11:52 a.m.) 
AFTERNOON SESSION 


12:40 p.m. 


BY MR. WITHEY:

Q. Back on the record. Doctor, in one of
the exhibits, I think it's 1, you have attached a
curriculum vitae. Is that correct?

A. Yes, in one of them I did.

Q. Are there any titles of either the books
or publications that contain within it health care
expenditures?

A. In the titles of what? 

Q. In the title. Is the term health care 
expenditures contained within the titles of any of 
your publications? 

A. I don’t believe so. 

Q. Was it a subject matter of any of the
publications that you have identified on your C.V.,
the topic of health care expenditures?

A. Expenditures per se? Probably not.

Q. Did you discuss with, was it Mr. Schafer
or Professor Schafer, doing the multiple imputation
analysis. Have you discussed with him any
preliminary results?

A. I have not discussed any preliminary
results.

Q. Have you discussed with him when he is
going to get his work done, if at all?

A. I discussed with him fairly briefly
probably three or four weeks ago the state of his
work project; and at that time it became
clear that he probably would not have the
time, assurances, not assurances,
anticipation otherwise that he would be able to.
And so that's why it was decided we had to change
gears in order to try to make the multiply imputed
dataset available in a more timely way.

Q. Do you have any correspondence or 
documents to or from him related to this project? 

A. No. Just one, well, it may be two phone
conversations.

Q. No documents?

A. No. I have not received any documents.

Q. Have you sent any documents?

A. No.


Q. When we broke for lunch we were talking
about Surgeon General's reports. You have
cited a number of the Surgeon General's reports in
your own papers. Is that correct?

A. In my statements, you mean?

Q. Yes, your reports.

A. I have cited in a couple places some
[...], I believe.

Q. Would you want to consult with either a
medical doctor or an epidemiologist to determine
what [...] of risk factors for disease might be a
confounding variable in determining whether smoking
causes human health effects?

A. Yes.

Q. And would you want to consult with such
doctor or epidemiologist to determine whether the
effect of exposure to more than one risk factor
presents a synergistic as opposed to an additive
effect?

A. That would be reasonable.

Q. Do you know what synergism is as it
applies to disease?

A. It is an interaction, sure. 

Q. Do you understand the concept of 
attributable risk? 


A. I believe so.

Q. You've read the Surgeon General's report
of 19[...], chapter 3, that describes the
attributable risk?

A. Have I read the whole chapter? No, I've
read parts of it.

Q. Do you know who wrote that chapter?

A. It's possible that was the chapter that's
written by Jeffrey Harris because I know he did
write a chapter, but I don't know the chapter
number.

Q. Do you have any basis to disagree with
the discussion of attributable risk in that Surgeon
General's report?

MR. BIERSTEKER: I am going to object
to that, Mike. I mean, if you have something
specific you want him to look at, let him look at
it.

BY MR. WITHEY:

Q. Go ahead.

A. I don't know what specifically you are
referring to in that chapter. As I said, I've not
really read the whole chapter, I've read parts of
it. But I would be happy to comment on something
if you have a specific question about it.

Q. Do you know what the attributable risk
formula is?

MR. BIERSTEKER: Objection, asked and
answered.

A. Yes.

Q. Can you state it then?

A. I can state it in its most general form.
Do you want an equation?

Q. If you know, whatever you know about it.
Either by an equation or stating it in general form.

A. It is the same thing that is in my
document that I have attached that I call SAF Q, and
that's the general expression for when there are
different kinds of smoking behaviors called SAF^.
So I can just read it to you, if you would like me
to.


Q. I don't want you to read your report to
me. You can just refer to it.

A. Okay. So I basically derived that
expression from first principles under explicitly
stated assumptions.

Q. Do you cite the Surgeon General's report 
in that context? 


A. No, I don't.

Q. The attributable risk formula is well-accepted
in the field?

MR. BIERSTEKER: Object to the form
of the question.

A. I don't know what "accepted" means.

Q. Let me clarify it then. Is the
attributable risk formula as stated in the Surgeon
General's report a generally accepted methodology
for determining what portion of the percentage of
disease in a given population may be attributable
to a particular risk?

A. With a "may be." It is used for that
purpose.

Q. And it is accepted in the field of public
health for that purpose as reflected in the fact
that it is in the Surgeon General's report. Fair
enough?

A. It is accepted for that purpose of
indicating what may be the attributable cause.

Q. Do you have any expertise which would
allow you then to disagree or critique that
attributable risk formula?

A. Yes.

Q. What is your criticism of the formula?

A. It's in that report which lays out
what assumptions have to be made under
which the formula is appropriate, assuming you
have adequate data in the actual world to
[...] all the pieces of actual-world
[...].

Q. Have you ever performed an analysis that
has [...] a smoking-attributable fraction?

A. Have I ever conducted -- ?

MR. BIERSTEKER: I'm going to object
to the form as vague.

A. I'm not sure. For smoking-attributable
fraction, perhaps not, but I may have.

Q. Can't think of one now? 

A. Correct. 

Q. In Minnesota — and I can cite it if you 
need it — you've testified you had extremely 
limited familiarity with the topic of health care 
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costs of smoking. Is that your recollection? 

A. I don't have a specific recollection on 
"very limited," but if you could show me the 
context it would help. 


V • 


Well, do you recall that? 
Very limited? 


H Yes. 

I don't recall saying that exactly. But 
if yodCs$iy>w it to me, then I will certainly read 
it. yjjP^ld have to see it to be sure. It would 
l^^hel^^^ft to see the context. 

Q. Let me show you this transcript at page
350 to see if this refreshes your recollection
about the question and answer. I'll read it to
you and then show you the transcript. This is
again Mr. Hamlin on cross:

"Now, you have extremely limited 


familiarity with the topic of the health care costs 
of smoking. Correct?” 

And your answer, "Correct." 

"Question: And you've not consulted 

with anyone about that topic. Correct?" 

And your answer, "Correct." 

Here, let me show you. 
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A. All right, let me see. The way you
stated it is, you stated his question as my answer.
I didn't remember saying that. Didn't sound to me
like something I would say.

Q. Well, you agreed with it. Correct? 

A. I agreed with the question, yes.

Q. All right. Well, was that true?

A. Isn't that different from saying that
I said the same thing? Maybe not.

MR. BIERSTEKER: Let me ask, since
it appears to be something other than the
official transcript and you've been referring to
page numbers here, do you think it might be
[...] to get a copy of the version that you are
[...] just make it an exhibit to the deposition
[...] all over?

MR. WITHEY: Absolutely, sure. I'll
even leave it here with the court reporter and he
can -- In fact, let's just go ahead and mark this.

MR. BIERSTEKER: That’s fine. 

Thanks. 

(Rubin Deposition Exhibit 5 marked 
for identification.) 

BY MR. WITHEY: 
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Q. Doctor, I am going to hand you what's 
been marked as Exhibit 5 in your deposition, which 
is the unofficial transcript of the testimony you 
gave in Minnesota at trial which we previously 
referred to in this deposition. But since counsel 
has asked it be marked as an exhibit, we have no
problem with that. And let me just indicate that
Exhibits 3 and 4 are the two Sommers' articles that
you, Mr. Biersteker, will provide to us and the
court reporter for inclusion in this deposition.

Feel free to read before and after
if need be, but did I correctly read
the questions and answers contained within this
transcript?

A. "Correctly"? You correctly read --

Q. The question is, did I correctly read the
questions and answers? That's my question.

A. Well, I certainly remember that you
correctly read the answers correctly, correct. And
I don't have that good a memory of exactly what you
read. I presume you did. I don't have any doubt.

Q. And at the time you understood you were 
under oath in Minnesota? 

A. Yes. 


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Q. Did you answer that question I read 
truthfully, sir? 

A. Yes. Now may I take a minute to read 
before and after, before we continue? 

Q. Sure. 



A. Thank you. (Pause) Okay, I have a
context. May I ask a question? Because I don't
understand something in this copy. There's an
asterisk 24 before the question. Just so
I understand the context better.

Q. I think that refers to some other page
number. That's the way it's usually done on e-mail
or electronic format. Thank you. Let me have that
back. Have you done reading?

A. Yes.

Q. Thank you.

On page 9 of your supplemental report 
which is attached as Exhibit 2, you refer in 
footnote 5 on that page to Dr. Dement's estimate of 
relative mortality risk of smokers. Do you see 


that? 


A. No, I'm afraid I don't. 

Q. Page 9 of the supplement. Exhibit 2. 

A. Supplemental report and I'm looking at 


page 9 and I don't see a footnote here. Am 
I looking at the wrong thing? Oh, here it is. I'm 
sorry. I was looking further along. Okay, I see 
the footnote. 

Q. The last sentence of that footnote says 
conduct propensity score analyses to 
size of the differences between these 
lb and the American Cancer Society 
Correct ? 

A. That's what it says, yes.

Q. Have you done such analyses?

A. No, I have not yet.

Q. Do you plan on doing it?

A. When time permits, I believe it probably
would be a useful idea.

Q. When did you first discuss with
Mr. Biersteker or any other defense counsel the
fact you wanted to do or were planning on doing
propensity score analyses on this topic?

A. On the ACS or generally? 

Q . Yes. 

A. On the ACS? Probably about the time 
I was finishing up this report. 

Q. Did he tell you when that work should be 



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completed by? 

A. No. I don't remember in fact having 
further conversations about that. 

Q. Did he tell you that you had to complete 
all of whatever additional work you were doing one 
week before the deposition so you could give it to
us? Did he tell you that?

A. He said those were the deadlines that
were supposed to be tried to adhere to.

Q. Have you relied on a Harvard Ph.D. thesis
in your work in either this case or the
Laborers case?

A. I think I may have referred to a thesis
that was just completed last year by Leonard
Roseman. I believe I mentioned that. I'm not
positive.

Q. How do you spell that name?

A. R-o-s-e-m-a-n.

Q. What is that thesis? 

A. That was a thesis on looking at survival 
analysis models and observational studies where the 
methods for controlling for confounding variables 
were either model-based, like Cox proportional 
hazards modeling, or propensity score methods, or a
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combination, and analogous other work that had been 
done previously in observational studies. And it 
basically showed that the modeling by itself can be 
severely biased in certain situations, and the 
combination of propensity scoring methods and 
modeling is clearly superior to either method
alone, especially to the modeling alone.

Q. Have you cited it in either your November
6 report or your supplemental report?

A. I don't believe so. And I may be wrong
[...]ng specified that as a document, but
I did for some case.

Q. Did you ever provide that to Barbara
Harding so she could send it to us?

A. I believe I made a copy of it and sent
it. But again, my memory is a little vague on
that.

Q. Other than what we've identified, do you 
intend on performing any more or any additional
work in preparation for your -- Strike that. Did
you want to say something, Professor?

A. Yes, I do want to supplement my last 


answer. 

Q. Go ahead. 
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A. I just want to be accurate, there was a
paper that was part of a thesis that was done in
the department of economics that I believe was
submitted back even in the early days of my
involvement, which is probably in Mississippi,
which is by two guys -- and I'm going to get the
names wrong -- [...] Dehesha, D-e-h something, and
I apologize to him, and Wabha, I think W-a-b-h-a,
who were Ph.D. students in the department of
economics. In economics you can submit joint
papers as part of a Ph.D. thesis. And this was a
[...] that, as I say, had been undergoing peer
review and I think it's supposed to appear in the
Journal of the American Statistical Association,
I believe; I'm not positive. But the early version
of that [...] which was part of the thesis, was
submitted --

Q. I’m sorry, but we’ve got limited time 
and -- 

A. You asked the question; I’m trying to 
answer it. 

Q. I didn’t ask you for the full history of 
it. I just asked you if there was any other 
document. Have you relied upon that thesis? 
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A. Yes. It was in reliance on — That 
piece of the thesis I relied on. 

I was trying to reconstruct the names
of the authors and what it was and why it was not a
complete thesis. I wanted to give an accurate
answer. That's all I'm trying to do. I'm not
trying to pause, I'm not trying to wait, I'm trying
to give you accurate answers.

Q. Let me ask you to direct your attention
to your Exhibit 1. I have some questions on this.
First of all, you're not a behavioralist. Correct?
You're not trained in human psychology. Correct?

A. Not much. Although it turns out that my
bachelor's degree is in psychology and I have
written many papers with behavioral psychologists
and I have conducted a consulting lunch for fifteen
years at Harvard in the department of psychology on


using statistics in behavioral work. I've worked 
with people in schizophrenia. 

Q. Your one-component and two-component 
behavioral model, as you call it in your paper, 
looks at individual-initiated changes in smoking 
behavior and trust-initiated changes on smoking 
behavior. Is that correct? 
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A. That's correct.

Q. As I understand it, then, you have not 
assessed the impact of tobacco industry changes in 
smoking behavior. Is that correct? 


MR. BIERSTEKER: Objection to the
form of the question.

A. Correct, I have not assessed. I have
described generally how one would do that but
I have not done the assessment myself.

Q. Have you reviewed any documents or any
[...] that have looked at what impact the tobacco
industry has had on smoking consumption?

A. Outside of the pieces -- ?

MR. BIERSTEKER: Object to the form.

MR. WITHEY: Let him finish his
answer.

MR. BIERSTEKER: I'm just interposing
an objection.

MR. WITHEY: Not in the middle of an
answer. Okay, go ahead, go ahead.

MR. BIERSTEKER: Maybe, Doctor, you
should wait thirty seconds to enable me to object
to the question before you start your answer.
Would that be preferable, Mr. Withey?
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MR. WITHEY: No, thirty seconds isn't
needed.


MR. BIERSTEKER: Thank you. 

MR. WITHEY: You can form an
objection within five seconds, counsel.

BY MR. WITHEY:

Q. Go ahead, Professor.

A. Could you restate the question?

Q. Have you read any research papers,
Surgeon General's reports, articles that describe
the impact that tobacco industry conduct has had on
smoking prevalence and consumption?

MR. BIERSTEKER: Object to form.

A. I certainly have read a variety of
documents in the context of this litigation, some
of which are reports from plaintiffs, some of which
are reports from defendants, and others of which
are just reports, documents that had been produced
for me to look at, such as the Surgeon General's
report which I read parts of only.

Q. Have you read Chapters 7 and 8 of the 
Surgeon General’s report of 1989? 

A. Certainly not in their entirety. But it 
is possible that I have read parts of them. I just 
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don't know. 

Q. Do you know what topics are dealt with in 
those chapters? 

A. No, I do not. 

Q. Let me just read you the titles of those
chapters and see if that refreshes your
recollection about whether you've read what the
Surgeon General has to say. Chapter 7 is
-- and I don't have a copy of it here but
[...] what the title is -- Smoking Control
Policies, and Chapter 8 is Changes In The Smoking
[...] Environment, Behavioral And Health
Consequences. Have you read either of those
chapters, Professor?

A. I certainly have not read either of those
chapters in their entirety. It is quite possible
I have read parts of both of those chapters,
particularly the second one. I think I may have
read parts of it.

Q. Let me ask you, have you ever seen what 
is Figure 3 within that Chapter 8, which is a chart 
that shows adult per capita consumption and major 
smoking and health events? And let me just show it 
to you and see if you recall that. 
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A. I believe I have seen this, although 
I don't remember seeing whatever is on the right- 
hand side of the chart, I don't know if that's an 
addition or what. But I believe I've seen 
something like this. 

Q. Have you ever read any of Ken Warner's
work that looks at the issue of consumption trends
in the United States and the impact of certain
events including tobacco industry conduct?

A. I don't believe so.

Q. And you've already testified that you are
not an expert in the history of the tobacco
industry. Correct?

A. Correct.

Q. Nor, I assume, are you an expert in the
history of the tobacco industry's efforts to
influence consumption of cigarettes. Fair enough?

A. That’s correct. 

Q. Do you recall in your review of Chapters 
7 and 8 of the Surgeon General's report whether you 
read any statements made in those reports about 
whether it is possible for historians or public 
health officials to isolate out specific events 
including tobacco industry conduct or reaction to 
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that as it relates to consumption of cigarettes as
opposed to looking at the aggregate effects of a 
series of events and conduct? Do you recall any of 
that topic being discussed? 


MR. BIERSTEKER: Objection to the
form of the question.

A. That's a long question, but I believe
I can answer it because you asked do I have a
specific recollection of reading discussion of
that. Is that the question, the gist of the
question?

Q. Yes.

A. And I do not have a specific recollection
of that. Even though if I did, I might not know
how to answer the question nevertheless because it
was long and complicated, but....

Q. Would it be something that historians and


public health economists would have a better 
judgment than you would on whether it is 
appropriate to look at the aggregate effects of 
conduct over a series of years in assessing their 
effect on consumption curves as opposed to a 
particular statement at a particular time by the 
tobacco industry? 
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MR. BIERSTEKER: Object to the form
of the question. It's compound, it’s vague. 

THE WITNESS: Could we have it read 

back, please? 

(The reporter read the question.) 
A. Not necessarily.

Q. Has the tobacco industry influenced the
consumption of cigarettes in this country over the
last forty-five years?

A. I am not an expert in that, but as a
person, I would expect that they have.

Q. How have they done so?

A. Well, presuming they spend lots of money
on advertising and presuming they think and many
others, people think, advertising people think that
advertising is an effective way of switching brand
loyalty or encouraging perhaps more smoking or
different brands of smoking. Presumably that's why
they do that.

Q. Any other ways the tobacco industry has 
affected the level of consumption of cigarettes 
that you can think of? 

A. Well, there are lots of allegations in 
the documents that have been floating around these 
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cases that certainly claim that there are other
ways in which they have influenced it.

Q. Such as?

A. Such as withholding information about the
dangers of smoking; such as not bringing to market
safer products and thereby altering the mix of
people smoking safer cigarettes, if such things
exist, and current cigarettes. Those are two main
ones that come to mind. I guess also the kind of
advertising, there's been claims of targeting women
or targeting particular subgroups of people with
particular kinds of advertising. There may have
been allegations to that.

Q. Like teenagers?

A. There have been allegations to that, yes.

Q. Anything else you can think of that
suggests the belief that the tobacco industry has
influenced how many cigarettes have been consumed
in this country over the last forty-five years?

A. Well, I don't know if this is separate 
from the previous ones or not, whether you regard 
it as separate or not, but certainly Jeffrey Harris 
has written substantially about the conspiracy, the 
antitrust conspiracy that would according to him 
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have influenced the consumption of cigarettes of
different types and amounts in the United States. 

Q. Do you have an opinion — Let me ask one 
other question. Do you believe that smoking 
cigarettes can lead to an addiction to nicotine? 

MR. BIERSTEKER: Object to the form
of the question.

A. I am not an expert in that at all. My
layman's understanding of nicotine is that for some
people nicotine can be an addictive drug and that
cigarettes contain more or less levels of nicotine.

Q. Have you made any attempt to determine
the impact on smoking consumption if the tobacco
industry had stated that nicotine is an addictive
substance and it is contained within our tobacco
products at any particular point in time?

MR. BIERSTEKER: Object to form.

A. I have not made a study of that
particular question. But this general framework
that I'm laying out statistically shows what has to 
be done with data and the kind of assumptions that 
have to be made to address that kind of a 
question. So specifically, no, but in general 
I have. 


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Q. Do you believe that the tobacco
industry's statements denying that its product was
addictive had any impact on consumption of
cigarettes?

A. Will you read that back? I'm sorry. I'm
[...]

Q. Do you believe that the tobacco
industry's statements denying the addictive
qualities of its product and nicotine in its
products had any impact on consumption in this
country?

A. I have only a layperson's opinion on
that.

Q. What is that?

A. It's kind of mixed because it certainly
is possible that it has had an effect on some
people, although there seems to be other evidence
that people were generally aware of that for many,
many years and they still chose to start smoking or
smoke more heavily.

Q. Have you considered in what you call your 
two-component behavioral model the tobacco 
industry-generated changes in a counterfactual 
world other than — 
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A. It might be helpful if I clarify my last 
answer. 

Q. Let me finish my question and then you 
can clarify. — other than defendants withhold 
information on health risks? 

A. Can I have the question read back,
please? I'm sorry.

Q. Have you considered any other tobacco
industry-initiated changes in conduct in your
two-component behavioral model other than, quote,
defendants withhold information on health risks?

MR. BIERSTEKER: Object to the form
of the question. I believe it mischaracterizes
what the report says. Go ahead.

A. I think, as the report indicates, that is
just an example of the kinds of issues that arise
where [...] that kind of alleged misconduct as an
example. Have I explicitly considered any?
I could have made examples which are just examples
like that with any form of the alleged misconduct.
The point is that the general framework that I am
formulating can be applied to any set of alleged
acts of misconduct.

Q. But it is your understanding that -- I'm
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sorry. 

A. Can I clarify my previous answer? 

I thought that was — 

Q. Go ahead. 

A. I think that was with respect to the 
layperson's point of view. It is not only from
nd things but from knowing people who 
d for a long period of time, that some 
addicted to the nicotine and perhaps 
n't. My father was a very heavy smoker 
k twenty years of his life, stopped 
forty and never smoked again, with no 
He stopped. And he seems to be very 
eighty-something now, whatever he is. 

So I'm sure it varies by people. But 
the fact that nicotine appears to be addictive to
some people I think has been well-known for a long
time.


Q. Is heroin addictive to some people but
not others?

A. I'm not an expert on that, again. But in
fact I think there is some evidence for that, yes.
I believe there is.

Q. In your proposed behavioral model you 
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state, and I'm trying to find the exact quote but 
let me just characterize it, you state that it 
would be important to consider the effect on given 
individuals of specific events of misconduct or 

specific events of the defendants' actions as
distinct from their aggregate effects. Is that
correct?

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A. I think you mischaracterized that. Would
you show me what you're referring to?

Q. I'll try to find it. (Pause) Let me
[...] to page 7 of your Exhibit 1 under Trust-
Initiated Changes In Smoking Behavior.

A. Okay.

Q. "Suppose again that it were alleged that
the defendants failed to disseminate information
about the health effects of cigarette smoking in
1965. [...] would want to estimate the effect of that
alleged misconduct on the behavior of each trust
and then estimate the effect of any modified
behavior by the trusts due to lack of alleged
misconduct on the smoking prevalence of the
individuals who were or would have been the
recipients of trust-funded health care." Did
I read that correctly?
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Donald B. Rubin, Ph.D. 


A. Yes. 

Q. So what I'm asking you is whether it is 
appropriate as a historian or public health 
economist or health-care expenditure expert to try 
to isolate out the effect of the defendants' 


failure to disseminate information about the health
effects of smoking in a particular year and to
judge the effect, that misconduct, as you put it,
on either trust fund-initiated behavior or on
individual-initiated behavior?

A. I can't answer that question because you
[...] and health care economist or something else.
I'm not that, so I can't answer that.

Q. And you can't answer it because it takes
some judgment as to whether it is appropriate again
as an historian or an economist or health care
economist or public health official, whether it is
possible to segregate out the impact of a
particular act in a particular year of misconduct
as opposed to aggregate acts going on over many,
many years?


A. Entirely wrong. And the reason why 
I said I can't is because you said as a person I am 
not, something happens. Now, if you would like to 
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Donald B. Rubin, Ph.D. 


ask the question again without the "as a 
something," I will answer it and the answer will be 
different. 

Q. What research, studies, published 
articles in the field of the effect of the tobacco 


industry's conduct or misconduct upon individuals,
let's phrase it, have you relied upon in support of
the statement you make that it would be appropriate
to determine the effect of defendants' failure to
disseminate information about the health effects of
smoking in 1965 on either individuals or the trust
funds?

A. The sentence says "suppose" so it's an
example. The statement is just a statement of the
scientific method, that if you're looking to see
what the effect of many changes is on a system, the
scientific way to address that is to try to address


what the effect of small changes is and how they 
accumulate. That's part of experimental design. 
That's the way you do experiments. It is much 
easier to estimate in general the effect of a small 
change than a massive change on a system. 

Q. No, I'm asking a different question. I'm 
saying not in general, I'm saying in the context of 
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what impact did various acts of misconduct of the 
tobacco industry have on smoking consumption. 


Okay? 


A. In general the — 

Q. Have you found in any text, any peer-
reviewed [...] or any Surgeon General's report, that to
conduct such an examination as they did in 1965 as
opposed to what the tobacco industry did over a
long period of years is an appropriate scientific
research method?

MR. BIERSTEKER: I object to the form
of the question, object that it assumes facts not
in evidence, and object that it mischaracterizes
what Professor Rubin is doing here.

A. The only statement that I've seen one way
or another about bundling all effects together or
unbundling them has been in Dr. Harris' work.


I haven’t seen anything even addressing whether you 
should do it all bundled together or not. I mean 
all the alleged acts of misconduct. There is a 
general scientific principle which I stated in the 
answer to the previous question, I think, of which 
this is a specific example. 

Q. But Dr. Harris believes it is appropriate 
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Donald B. Rubin, Ph.D. 

to aggregate the combined acts of misconduct of the 
tobacco industry in assessing their effect on 
consumption. Correct? 

A. I don't know if he believes it is 
appropriate. He says -- Well, I guess that’s what 

he does.

Q. Right.

A. And I believe he says he doesn't know how
to do it any other way. Is that correct?

Q. I'm asking you the other question.
I'm saying do you know of anybody who
says it is appropriate to isolate out a particular
year or particular act of the tobacco industry to
assess its effect on consumption?

A. Besides me, you're saying? Because
I --

Q. Yes, besides you.

A. Well, I am again not an expert in this, 
but apparently the court does in the sense that it 
says certain things that were alleged like 
antitrust behavior in some cases are off the table 
so you have to address the effects of the acts of 
alleged misconduct that are on the table. So 
apparently it is my lay understanding of what the 
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court ruling has been in this case that they say 

that. 

Q. Other than Judge Gwyn and yourself, does 
anybody else say it is appropriate to look at an 
individual year or an individual act of misconduct 
to assess its effect on consumption as opposed to
looking at the aggregate effect of a series of
actions taken over many years?

A. Well, speaking from the general point of
view, anyone who does scientific work in
[...] I think would agree that when you make
[...] for justifying the effect of things, it
is easier to do that for small changes than for
[...] changes in a system.

Q. Name one author, one text that has looked
at that from the standpoint of the actions of
the tobacco industry and their effect on
consumption. Can you think of anybody, Professor?

A. I have no text in mind that would do 
that. And it's a very — It's a strange question 
to ask, are there texts that provide general advice 
on something and do they have a specific example 
that has arisen in the last few years. No, they 
don't. Not that I know of. Maybe there will be. 
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Q. Are you interested in what the Surgeon
General's report in Chapters 7 and 8 says about 
that question? 

A. Sure I'm interested. 

Q. And if it were to say it would be 
inappropriate to try to isolate out an effect of
one statement made by the tobacco industry on
consumption as opposed to looking at the aggregate
effect of tobacco industry misconduct, would you
defer to that authority on that topic?

A. Not on that topic, no.

Q. How many years have you been analyzing
the tobacco industry misconduct?

A. Only since I have been involved in the
litigation.

Q. How many years has the Surgeon General
been analyzing that?

A. I mean, there are different Surgeons 
General and the reports, I understand, are written 
by committees. 

Q. All right. Surgeons General and their 
committees. How long have they been analyzing that 


topic ? 


A. For many years. 
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Q. Do you think they have a better basis of 
determining whether the tobacco industry conduct 
should be looked at in isolation, i.e., one year or 
one act or one statement, as opposed to aggregate, 
than you do? 

A. Not necessarily. It's a topic about
scientific method and statistics.

Q. You've answered the question, Doctor.

A. I wasn't finished with my answer. I don't
believe [...] I pause five seconds and therefore I'm
[...]

Q. [...] you're trying to justify an answer, not
give it. So if you want to go ahead and justify
it, go ahead.

MR. BIERSTEKER: Oh, please! Come
on, Mike. We just sat through two days of
Dr. Harris. That's being a little unfair.

THE WITNESS: Can I take a break? 

MR. WITHEY: Let's go off the record. 

(Discussion off the record.) 

(In recess 1:25 p.m. to 1:28 p.m.) 

BY MR. WITHEY: 

Q. Professor, in your previous answer before 
the break you indicated you had some understanding 
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of a ruling the judge has made. Where does that 
understanding come from? 

A. Primarily from Mr. Biersteker. 

Q. Have you read any orders or rulings that 
Judge Gwyn has made? 

A. It's possible I have. I'm hesitating
because in other cases I certainly have read
[...] pieces and I may have here or I may not
have. I just don't remember precisely.

Q. Have you read what Judge Gwyn has had to
say about the defendants' motion to dismiss on the
basis of failure to prove causation in this case?

A. I don't believe so.

Q. Has Mr. Biersteker told you what Judge
Gwyn ruled?

A. I don't believe so. But he may have.
I don't have a recollection.

Q. Do you have an understanding as to what 
Judge Gwyn has ruled as to whether trust fund- 
initiated changes are a proper question to ask 
vis-a-vis causation in this case? 

A. I don't believe that I know that, 
although I do seem to remember there was some issue
about that before these reports were written. 
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I believe. There was some discussion, which 
I don't have a clear recollection of what that 
discussion was. 

Q. Are you interested in what Judge Gwyn had 
to say about that topic? 


A. Sure I'm interested.

Q. Now, let me ask you some questions about
this counterfactual two-component behavioral model
that you've discussed in your report, Exhibit 1.
First, how many different aspects of the tobacco
industry's statements, conduct, misconduct,
et cetera, would you include within this model that
you've described?

A. I'm not sure I understand the question.

Q. Well, you've already testified today
about one example you gave in 1965, whenever it
was, the industry made statements, I'm reading


from page 10, "defendants withhold information on 
health risks." How many different kinds of 
conduct, statements, misconduct, whatever you want 
to call it, let's say industry-initiated changes, 
would you consider in the counterfactual world? 

A. For each alleged act of misconduct that’s 
still on the table, I would investigate each of
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those and each of those in combination. 

Q. And how many do you think there are? 

A. I am not an expert in knowing that.

Q. You would want to know what years those 
allegations or acts of misconduct took place. 


Correct?

A. Let me be clear. For each alleged act of
misconduct or combination that the plaintiffs think
are relevant, for each of those there is a
counterfactual world. So for each one that the
plaintiff, not me, but the plaintiff brings forward
as a combination of alleged acts of misconduct
there would have to be an analysis like this.

Q. And those would have occurred between the
time [...] alleged the conspiracy started, in 1954,
up until literally the present. Fair enough?

A. If these alleged acts of misconduct
continue to the present, then that's correct.

Q. And there would be a number of different 
acts of misconduct that we have stated in our 
complaint for almost every year of the existence of 
this conspiracy. Correct? 

A. Correct. 


Q. And so you would want to look at each of
those. Fair enough?

A. I would want to look at each of those, 
because in order to justify or to rationalize what 
effect that alleged misconduct would have on the 
prevalence of smoking behaviors you have to know 
what the act was and what the logical argument is.
Because we don't have, we can't go back in time and
rerun the world to see what would have happened.
So it is a logical argument that has to be made;
and you're going to make the argument, you
have to make the argument. They can't just throw
up their hands and look in the sky and make up a
number and have anyone believe it.

Q. And that's what Dr. Harris has done,
then, just thrown up his hands and made up a number.
Correct?

A. He has basically said "I can't
disaggregate it, that's too hard. Instead, I'll
think about it and I'll pull some numbers out"
and.... He has a tremendous amount of
information. I'm not criticizing him on that. He
is an encyclopedia of knowledge on this material.
I'm saying the analysis is all bundled together and
it doesn't have the attributes of a statistically
valid scientific analysis.

Q. And so far as you know the Surgeon 
General's Chapters 7 and 8 also bundle the analysis 
together. Correct? 

A. I mean, if you could show me where they
bundle the analysis together.

Q. I'm asking you now. Do they aggregate
the effects as opposed to disaggregate the effects?
If you know.

MR. BIERSTEKER: Object to the form
of the question.

A. I believe they're kind of addressing a
different question, which is do I know the answer?
I really don't. I would be happy to glance at it.
Perhaps I have read those parts of it, and if I saw
it, maybe it would refresh me and I could answer
more correctly.

Q. Sitting here today at your deposition you 
don't know the answer to that. Correct? 

MR. BIERSTEKER: The answer to what?
Object to the form. 

BY MR. WITHEY: 

Q. Does the Surgeon General's report
aggregate the effects on consumption or
disaggregate them?

A. I do not know. 


MR. BIERSTEKER: Wait. I object. 


BY MR. WITHEY: 


Q. If you don't know, that's fine.

Then in addition to looking at the
specific acts of misconduct alleged for each of the
years that the conspiracy was believed to exist,
you would want to assess the impact of those on the
trust funds' behavior and the individual smoker's,
the beneficiary's behavior. Correct?

A. Correct, assess in the sense of making an
argument why it would be this way. Because we
don't have data, direct data on that.

Q. When you postulate that in 1965
defendants withheld information on health risks,
postulate it as part of the method, I mean, would
you also want to know in asking the questions that 
you would ask in this two-component behavioral 
model what was the nature of the information 
withheld and would it have made any difference to 
the individual? 

A. If the claim is that withholding 
information had an effect on smoking prevalence 
behavior, then it would be useful to know what that 
claimed withheld evidence is. Otherwise I don't 
know how to scientifically evaluate the claim. 

Q. And that would require you to have
knowledge about what was in the internal files and
documents of the tobacco industry to find out what
the information was that was withheld. Fair
enough?

A. I don't know if it would require that
because I'm not familiar with what's public and
what's not public. But it might be helpful; it
might be required. I don't know for sure.

Q. And of course that's something you have
not studied, the internal documents of the tobacco
industry?

A. Correct.

Q. How many beneficiaries would have to be
interviewed under your two-component behavioral
model?


A. Would have to be interviewed? 

Q. Yes. I assume you have to ask some 


people these questions. Right? 

A. At some point one would have to
collect -- not have to -- it would be highly useful
to collect data that would help inform these
counterfactual prevalences of smoking behaviors.
And you could certainly interview people to help
get that information.

Q. How many would you have to interview to
have a valid study, Professor?

A. It depends completely on what kind of
questions the interviews are asking, what the
purpose is. It's -- I can't --

Q. Let me read from your report and see if
I can focus this question because apparently
you're having some problem with the way it's
formulated; and I appreciate that. You state "This
model of the effect of the changed smoking behavior
on health care expenditures of the trusts' members
would have to consider characteristics of the
individuals who were or would have been recipients
of trust-funded health care. These characteristics
would include year of birth, sex, race, income
level, education, baseline mental and physical
health, and other confounding factors, such as
health-related behaviors, that may be important
predictors of how smoking behavior affects the
medical costs of the trusts' members." That's
reading from pages 7 and 8 of Exhibit 1. You wrote
that, right?

A. Well, let me find it first. So you're 
reading the bottom paragraph on page 7? 


Q. I read the entire bottom paragraph on
page 7 and going on to the first line of page 8.

A. Yes.

Q. And it would be ideal that "each such
characteristic would be measured on each individual
before the moment in time when the alleged
misconduct had an effect." Correct?

A. Correct.

Q. Now, I guess the question that I have --

A. I'm finishing the sentence. You stopped
in the middle of the sentence. So it is probably
useful, if you're going to put it in the record, to
read the whole sentence and not stop halfway.

Q. "Had an effect on that characteristic for 


that individual.” 


A. Correct. 


Q. So the question is: I assume then in 

what you're calling the model that you are 
suggesting would be an appropriate model to use, 
you would have to do what you just said, consider 
the characteristics of the individuals who were
recipients of health care. Correct? 

A. Correct. 

Q. And the question I have is: How many
individuals or beneficiaries would it require
someone to interview in order to gather those
characteristics from each of them to have a valid
survey of this beneficiary population?

A. For this part, just to be clear, for this
part, what we're trying to do is understand
in the pertinent population how smoking
prevalence in the actual world varies across these
different categories of people; and then in this
counterfactual world we have to estimate or posit
what those prevalences of smoking would be without
that alleged misconduct. Okay?

Now, the first part of the survey is

just to get the smoking prevalences in the relevant 
population in the actual world. The second part 
would be to try to get some feeling of, without 
those alleged acts of misconduct, how the 
prevalence would have changed. So there are two 
different activities. Okay? 

Q. Now answer my question about how many
people you believe would need to be interviewed.
Are you saying you don't have to interview anybody?

A. Of course not.

Q. How many people would you interview,
Professor?

A. It depends upon how much detail you
actually want to include in behaviors in the model,
how much it appears that actual-world smoking
behaviors change by those characteristics and,
secondly, by what plaintiffs claim without specific
acts of misconduct that smoking behavior would
be. For example, if their claim is everybody
would smoke half as much in some sense as they do
now, to determine whether it's a half or 45
percent doesn't require an enormous supplemental
survey to ask people what they would have done had
they known that smoking was bad for you in some
sense.


Q. I'm asking how would you do it. You 
state in your report that you would estimate, and 
let me just read from page 5 at the top of the 
page, "For each distinct type of smoking behavior 
that is considered relevant to health care costs, 
I would estimate its prevalence in the factual
world and in the counterfactual world."


A. Correct. 


Q. Did I read that correctly? 


A. Yes, you did. 


Q. So in order to estimate its prevalence in
the counterfactual world, wouldn't you have to
interview some beneficiaries?

A. You certainly would have to if you're
going to estimate from data, yes. But let's
suppose that, for example, the plaintiffs' claim,
their view of what happens to smoking prevalence in
the counterfactual world, we want to know what that
effect would be on health care expenditures, how much
would have been saved. If their claim is that for
everyone, no matter what type of person you are,
smoking intensity goes down by a factor of two and
all we have to estimate is whether it's a factor of
two or a factor of three or a factor of four or a
factor of 1.3, then you don't need a very big
survey to estimate one number. It's a fraction.


Q. How big? 


A. To estimate a fraction? If that's what 


their claim is, that everybody, no matter what your 
background characteristics are, it goes down by a 
factor of some unknown factor, and you want to
estimate the factor, then a few hundred would do. 

Q. A few hundred beneficiaries? 

A. If that is the claim and that's the claim
you wanted to address. If they claim that it goes
down half and we don't need any data because
it goes down by a half because I thought about it
last night and I had a dream and somebody came to
me and told me it goes down by a factor of a half
for everybody, then we don't need any data to
assess that claim other than the data in the actual
world.

Q. No, I'm asking you because you stated in
your report you would estimate the prevalence in the
factual world and the counterfactual world. I want
to know what you would do to either support the
claims that the tobacco industry misconduct had
an effect or that it didn't have an effect. In
other words, you're the one who is designing this
model. Obviously the plaintiffs' experts haven't
said this is the model that's needed; you're the
one who says it's needed. I want you to describe,
if you would, the model that you believe is
necessary in order to prove whether there is or is
not an effect of the tobacco industry on
consumption or prevalence. How many people would
you have to interview for your purposes of
estimating, as you put it, the prevalence in the
counterfactual world?


A. I am trying to answer that. I'm not sure
I'm being --

Q. Go ahead.

A. I'm not sure I'm being clear, so I will
try again. The last question asked just with
estimating the prevalence in the
counterfactual world. Now, there is a claimed
alleged act of misconduct by the plaintiffs. And
let's suppose the plaintiffs claim that because of
that alleged act of misconduct, smoking for
everyone went down by 50 percent. Then
non-smokers didn't smoke and everybody who smoked
would smoke half as much. If that's the claim and
we want to assess the extra medical expenditures
due to that alleged misconduct under the assumption
that smoking will go down by a factor of two
without the alleged act of misconduct, we don't
have to collect any data to estimate that because
they are claiming it's a half. It's just a number
and we can put it in the right formulas and get out
the answers using only factual-world data.

Q. So your testimony is you actually don't
have to interview anybody, any beneficiary, to
determine or to support the conclusion that the
smoking consumption would have gone down by a
factor of two, in other words, by half?

A. Absolutely not.

Q. I'm trying to get your -- You've said
various times --

A. May I finish my answer?

Q. Go ahead. I thought you were done.

A. I didn't say that at all. I'm saying if
one wanted to assess the excess medical expenditures
under the assertion, under the claim that without
the alleged acts of misconduct smoking would go
down by half, if you wanted to do that -- I'm not
saying that's the right thing to do -- but if you
wanted to assess just that claim, that is, that it
would be a half, we don't need data to estimate
that half. Somebody is asserting that it is and we
can see what it would be.

Now, in order to -- May I continue? 
Q. Go ahead.

A. In order to try to assess whether that
claim has any validity or not we can say, well,
let's suppose without the alleged act of misconduct
it would go down by some constant percentage for
everyone. So let's assume that. Is that constant
percentage a half? Does it appear like half is a
reasonable number? Which supposedly the plaintiffs
claim. Then you could do a survey to say, okay, if
there is a constant number by which everyone is
reduced, let's do a survey of a few hundred
people and ask them, smokers who are in the right
population, ask them: If you had known this,
how much less smoking would you do? And then you
add up over all the people in the survey and
get an answer.

If you want one number, you don't
need that many people if that's what the claim is.

Is that helpful?

Q. No. You're asking me; it’s not. But 
you’re the one that's proposing it. 

Let me just put it this way: What do
you understand to be the estimate of the reduction
of smoking prevalence in the beneficiary population
or expenditures in the beneficiary population that
Dr. Harris makes?

A. Sorry. Say that again. 

Q. What do you understand to be the
percentage reduction of smoking prevalence in the
beneficiary population or expenditures of the trust
funds that Dr. Harris has concluded as a result of
the industry's conspiracy and misconduct?

A. I think for expenditures, I believe it's
roughly half.

Q. And there is a range that he gives.

A. Yeah, but.... Yes.

Q. And what you're saying is in order to
prove that, you would have to construct this model.
Correct?

A. Not prove that. We're talking about
prevalence, prevalence in the counterfactual 
world. He has claimed prevalences in the 
counterfactual world which do not vary by 
background characteristics whatsoever. 

Q. And I'm asking you, as I understand your 
testimony, what you would do is to construct a 
model that models the effect of the alleged 
misconduct on smoking behavior over time and then 
models the effect of the changed smoking behavior 
on the health care expenditures of the trust 
members. Correct? 

A. Correct. 

Q. Now, let's assume you are constructing
that model now and utilizing that model. Okay? Do
you understand so far?

A. Yes.



Q. Would you have to interview some
beneficiaries in order to construct the model, not
utilize the model -- in order to model the effect
of the alleged misconduct on changed smoking
behavior? Would you have to interview some
beneficiaries in order to find the characteristics
that you said are needed on page 7?

A. When you are discussing prevalences in a
counterfactual world, you never get direct evidence
on that. The world has already taken place. We
cannot go back in time and rerun it and see what 
would happen. Therefore, any modeling of these 
counterfactual prevalences without acts of alleged 
misconduct must combine data and assumptions. 

What I was pointing out repeatedly,
I believe, is that if the assumptions are strong
enough, to assert something you don't need any data
to see what would happen under those assumptions.
As the assumptions become more realistic, you now
have weaker assumptions and you look for data to
replace some of the assumptions. So my job as a
statistician isn't to say exactly what those
assumptions are, but what modeling has to be done,
what the tasks are, given you're making certain
assumptions and want to collect data about other
assumptions.

Q. As I understand it, then, and let's look
at page 10 of your report, you've got a person.
Do you see it there?

A. The example? Top of page 10, okay.

Q. You have someone born in 1935. Right?

A. Correct.


Q. Is that ostensibly a beneficiary then? 
A. Pardon me? 

Q. Is that ostensibly a beneficiary? 

A. Yes. 

Q. And you have him start smoking in 1955. 


Correct? 


A. Correct. 
Q. And then you've got one of them, let's
say the first counterfactual world, number 1, the 
person quits for five years. You see that? 

A. Well, he quits smoking. 

Q. He quits smoking for five years, right. 


Starting in 1965. No? 


A. No, he doesn't quit smoking for five
years. He quits smoking, period.

Q. It says "quit for five years." What does
that mean? Counterfactual world number 1.

A. There's supposed to be a date here,
I think you punched out the date.

Q. What date is it?

A. 1970.

Q. All right, 1970, quit for five years.
Right?


A. That's correct. 

Q. Now, how do you find out that 

information? Is that just a hypothetical person or 
do you actually interview the person? 

A. This is a hypothetical example which 
I believe is operating at the level of an 
individual to make the modeling points. But, yes,
in general what you would like to do is find out if
the information were disseminated in 1965, what
would you have done, how much longer --

Q. How do you find that out? 

A. You have to combine assumptions with data
or you just do it all by assumptions.

Q. How would you do it?




A. I would want -- I'm the statistician who
doesn't make the assumptions. I'm the guy who
says, you're going to make those assumptions?
Let me tell you how to gather the data and what
modeling you will have to do.

Q. Okay. So what would you do?

A. Let's suppose that I'm working with
Jeffrey Harris and Jeffrey Harris says, okay,
I have this model that says that the percentage of
people who quit smoking in a year does not vary
with background characteristics of those people at
all but varies year by year and is a function of
people's age at that time. Well, under that
assumption what we have to do is estimate for
different age categories the percentage of people
who are in each year who, if the information were
provided in 1965, would quit. Then we would design
a survey of people who are current smokers and
beneficiaries to ask them, what would you have
done?


Q. And you design surveys. Correct? 

A. I help design surveys, correct. 

Q. For this purpose do you know how many
beneficiaries are in the population covered by the
trust funds in this case?

A. All the trust funds?

Q. Yes.

A. Hmmm. No, I don't, actually.

Q. Would you need to know that number in
order to figure out what an appropriate survey size
would be, sample size?

A. Not very much, no.

Q. So since you don't know the number and
you don't need to know the number, how many
beneficiaries would be within that sample that you
just described, that survey?

A. Under this assumption that it is a
constant number per year, you would like to get,
oh, a couple hundred people from each of those
years.

Q. For each year?
A. Or, if it's longitudinal, if you get the
same people at different points in time. I mean,
it would be nicer to have them from each year,
sure.


Q. Each year meaning which years? 


A. The years since the alleged misconduct
occurred. If you knew it was '65, what would you
have done in '65.

Q. So if it was in '54 when the first act of
misconduct is alleged, then you would have to have
it for every year since 1954. Correct?

A. No. There are various methods for
smoothing out the time.

Q. Tell me how many years you would have to
have it for, Professor.


A. I would like to cover the range of the
period of time. But if I'm going to make an


estimate for 1966, do I have to have data from 1966 
or is somehow data from three years before and 
three years after adequate? If I tell you the 
weather for three days before today and the weather 
for three days after today, do you have any idea 
what the weather was today? You have a fairly good 
idea. 
Is something funny?

Q. Yes. It's amazing, actually. But it's
just my own --

A. It's amazing?

Q. It's amazing. I'm just astounded.

A. What would your astonishment be?

Q. You're now constructing the sample size
and you're saying you would need a couple hundred.
That is two or three hundred people for
each of the years in which specific acts of
misconduct are alleged. Correct?

A. Not quite. I said if there's one number,
you need a couple hundred people. I said if
you wanted information from each year, sure,
you would have a couple hundred people from
each to estimating -- I'm still talking.

Q. I'm sorry. Beg your pardon.

A. May I finish now?

Q. Go ahead.

A. I also went on to say that you don't need
data from each year in order to make a pretty good
estimate for each year.

Q. But that would be each year of the
alleged misconduct. Correct?


A. Starting with the alleged misconduct.

Q. How long would each interview last? 

A. I guess it depends what questions you’re 
asking them, how many questions. 


Q. You're constructing this model, you're
deciding based on the misconduct alleged in this
complaint what impact it had on prevalence or
smoking consumption, and you're determining which
years you want to get the data from for how many
people. Pursuant to this chart that you've put in
your report, you're talking about various people
quit, various people didn't quit and various
events occurred. How long would you estimate the
survey would take to obtain valid questionnaires
from the participants?

A. I think you're mischaracterizing what
I've said in the report. And I'm not saying you're


doing that intentionally but I don’t think you 
understand what a statistician does or what a 
model-building effort is or what designing a survey 


means. 


Q. So you can't answer how long the 
interviews would be? 

A. That's not what I do. I don't design
interviews.

Q. I’m just asking if you know how long an 
interview might be for these questions that you 
think should be asked. 

A. I'm not the person who decides what
questions should be asked.

Q. So the answer is "I don't know" then.
Right?

A. I don't know without giving me more
information about what questions should be asked,
what the model is, what the specific acts of
misconduct the plaintiffs are putting forward are
and what assumptions the plaintiffs want to combine
with data to estimate what the smoking prevalence
would be in the counterfactual world.

Q. I take it, then, you have not yourself
thought through how this survey could be conducted,
how many people would need to be interviewed, how 
long the interviews would take, how long the data- 
collection process would take in the counterfactual 
model or world that you describe on page 10. Is 
that correct? You’ve not thought that through? 

A. In a general way I’ve thought about what 
kinds of data need to be collected. I have not 
designed specific questions. I have not costed
particular survey instruments. I have not 
contacted people who write questions. 

Q. In other words, you have not — Well, 
strike that. You've answered the question. 

In your proposed model construction,
would medical records have to be accessed?

A. For this part of the model construction
which has to do with smoking prevalence in the
counterfactual world, no.

Q. How do you determine quit rates?

A. How do I determine quit rates?

Q. How would you determine in constructing
the counterfactual world quit rates?

A. It really relates to the other part. The
plaintiffs have a claim about this misconduct
having an effect on quit rates in the
counterfactual world. They are going to combine
assumptions with some pieces of data that they need
to be estimated. When I know what their claims are
about which pieces have to be estimated and what
assumptions they're relying on, I can go forward
and help them design a survey to gather the
necessary pieces of data to complete that modeling
effort.

Q. But I understand on page 10 at least you 
have variously quit for five years, quit for 20 
years, quit for 25 years. Do you see that there? 

A. Yes, I do. 

Q. And I know this is an assumption or
hypothetical but I'm asking you: If you wanted to
get actual quit rates or the quit experience of
various individuals and you wanted to estimate the
impact of the tobacco industry conduct on it, would
you have to do a survey that discerns when people
quit and the reasons they quit?

A. Okay, let me try to be clear, because you
used the word "actual" so now we're talking about
the actual, quit rates in the actual world. Because
you used the word "actual" in your question.

Q. In the factual world.

A. In the factual world? 

Q. Yes. 

A. You would do a survey from actual-world 

data, so we're now talking about something else. 

I want to make clear this whole last line of 
questioning was about the counterfactual world and 
now you've moved to the actual world. Correct? 


Q. Are you going to use one to extrapolate
to the other?


A. No. It may be informative about the 
other, but the questions are really entirely 
different. 

Q. Then how many people do you have to
survey to determine quit rates and the reason
people quit in the actual world?

A. In order to estimate prevalences in the
actual world of different smoking behaviors, you do
not have to ask them why they quit.

Q. Do you have to ask them whether they
quit?

A. One would think so, if you want to know
whether they quit or not.

Q. What sample size do you need?

A. If you want to estimate that within these
cells, by cells I mean defined by these background
characteristics and other health-related behaviors, 
you would need — How big a sample size would you 
actually need? It would depend on.... Well, let 
me pause for just a second just to make sure that 
I'm talking about a purely survey operation now, 
nothing to do with what you would've done had there 
been something else going on. It's a different
enterprise. You may want to ask those questions of 
the same people, but this is just a survey about 
actual-world prevalences. Presumably you would 
want thousands of people in a survey like that. 

Q. Would you want to know their year of
birth, sex, race, income level, education, baseline
mental and physical health, and other confounding
factors?

A. Yes.

Q. And you would therefore have to design a
survey to that effect?

A. Yes.

Q. How long would such a survey take to
design, implement and analyze?

A. It depends probably on the population of
the group you're going after.


Q. We're talking about the beneficiary 
population in Ohio. 

A. To address how long it would take and the 
expenses involved, whether dollar or time, I would 
probably call people who do surveys to get a better 
grip on the answer to that question. 

Q. So you don't know that? 


JONES FRITZ & SHEEHAN 


https://www.industrydocuments.ucsf.edu/docs/xygl0001







A. I have a vague idea but it would be kind 
of speculative. 

Q. Tell me. 

A. For vague ideas, there would be thousands
of people.... I'm sort of talking to myself and
I apologize. Probably, depending on the number of
people you had involved at the beginning, it would
be probably a month at a minimum to design the
questionnaire and the survey simultaneously because
there are different kinds of people involved in
doing different activities. Presumably you would
do it by telephone. Do I have a good
feeling for how long it takes to do a telephone
survey? They're pretty efficient now, I think.
I think that technology has gotten pretty good.
Depending on how much money you wanted to throw at it
up front, how many telephone interviews we're
calling at the same time, whether it's sequentially
or in parallel. I imagine such a thing could be
done in a few months.

Q. Kind of the same time you've had to do a
multiple imputation from the data of NMES.
Correct?

MR. BIERSTEKER: You don't have to
answer that, Doctor.


BY MR. WITHEY: 


Q. About that same time, a couple months? 
A. Would you clarify the question, please? 


Q. Would it take around the same time that
you've had to conduct a multiple imputation
analysis of the NMES data?

A. I think you're mischaracterizing my
testimony.

Q. I apologize.

A. Because I think we said repeatedly that
I've been involved in trying to do it for a few
weeks, so I don't know where three months is
equivalent to a few weeks. But if you would
clarify why you think it is, I could continue from
that.

Q. Would you have to ask these people about
their smoking history?

A. It would be useful, absolutely, to know 
because that's a relevant health-related smoking 
behavior. I think that most people would agree how 
long you've been smoking and how intensely you have 
been smoking is related to, associated with these 
health outcomes. 
Q. Now, you filed reports that such data
collection, even going back to the AG-case Medicaid
recipients but in this case beneficiaries, was
needed, I think, in order to properly assess the
effect of industry misconduct, if any, on smoking
prevalence within that population. Correct? Let
me just clarify my question. In other words, this
kind of chart has appeared in previous reports
you've submitted in AG cases, chart 10?

A. That is correct. That chart has appeared
and illustrates the type of thinking that has to take
place, although it is also true that I've said that
the analysis certainly does not have to take place
at the individual level but has to be only
aggregated. I think I say that on the top of page
15, that from a statistical perspective such an
analysis in each counterfactual world need not be
done for each individual.

Q. I assume that is inherent in your 
statement that it would take thousands but in this 
case there are hundreds of thousands of 
beneficiaries, so you’re saying you don’t have to 
survey each individual person? 

A. Correct. Well, more than that. I'm
saying --

Q. You would aggregate the data from those 
thousands or so that you interviewed. Correct? 

A. It depends what you mean by aggregate. 

I'm saying the question is you don't have to
interview every individual, nor do you have to
necessarily have data from each microtype of
There is some aggregation, some
that takes place to smooth across types of




Q. Now, did you discuss doing such a survey
with attorneys for the defendants in the Ohio
s case? 

A. Did I discuss doing such an analysis,
such a survey?

Q. Proposing such a survey, yes.

A. I had one or two brief conversations with
Mr. Biersteker.

Q. When? 

A. Oh, this probably would be within the 
last week, maybe ten days. 

Q. At any time prior to that had you had any 
discussions with any defense counsel in any of the 
tobacco cases about conducting any such survey,
whether it be through Medicaid recipients or
beneficiaries? 

A. I think for a while we've had general 
conversation saying here are lawsuits involving 
billions of dollars and here are people trying to 
grab off the shelf that happened to exist and 



want to -- they seem to not want to
spend whatever it costs in terms of thousands or
hundreds of thousands of dollars to collect
data that really addresses the specific questions,
and we keep analyzing the same sort of old
data over and over, and for some questions
it might be nice to perhaps collect some
data.




Q. When did you first have such a
conversation with defense counsel?

A. Let me again be clear on the answer to
the last question. We're talking here about
smoking prevalence. That’s what we’re talking 
about, the surveys on smoking prevalence in the 
actual world and gathering information on what 
smoking prevalence would have been in a 
counterfactual world without certain acts of 
alleged misconduct. Okay? That's the context in
which we're talking. Okay?

It is possible that those
conversations may have gone back a year. Because
just generally as a statistician, not as a
statistician who does much litigation but just as a
statistician, if there are questions that arise for
which people want some information and it seems like
existing databases don't provide good estimates of
those pieces of information, like prevalences in
certain subpopulations and prevalences in the
counterfactual world, then the natural inclination
as a statistician is to recommend getting some
data, in this case on these prevalence measures.

Q. And you did so recommend it?

A. I don't know that I recommended it.
I said the natural inclination is to recommend.
I'm sure I had -- Something funny again?

Q. Yes. Did you recommend is all I'm
asking.

A. Is that funny? I don't know. Anyway....

Q. Yes.

A. It is?

Q. Yes. Did you make that recommendation?
You said it is the natural inclination to recommend
and I'm asking you, did you make the
recommendation?

A. I don't believe I made a recommendation. 

I believe I probably said have you considered, 
Q. And what response did you get?

A. I don't remember precisely, but I think
it was "Well, that's a possibility."

Q. Did you offer to make sure that the
survey conducted was done according to the proper
statistical methodology?

A. I don't remember making that offer.
I may have said something like "And I could
probably help do it."

Q. And you would have, I assume, obviously
gotten paid for that if you were asked to do that?

A. I would certainly hope so.

Q. And you have not been asked to do that. 
Is that correct? 

A. That's correct. 


Q. Did you ever give an estimate of the time 
it would take, your time I mean, for you to involve 
yourself in assuring proper statistical 
methodology?

A. In a survey for smoking prevalences in
the actual and counterfactual worlds?

Q. Yes.

A. I don't believe so.

Q. I asked you when you first had such
conversations and you said about a year ago. With
whom did you have those conversations?

A. I said maybe about a year ago, I believe.
If I didn't say "maybe" I should have.

Q. All right.

A. I am almost certain Mr. Biersteker was
involved and there may have been another attorney
as part of the conversation. I just don't
remember.

Q. You don't remember his name or her name?

A. The only possible -- I mean, I have a
very vague recollection but I don't think it's
right, so I don't know. I really don't know.

Q. Did you discuss with either 

Mr. Biersteker or this other lawyer or lawyers how 
much time it would take for other disciplines to 
conduct again this same survey that we've been 
discussing?

MR. BIERSTEKER: Objection, asked and
answered.

A. How much time it would take other
disciplines? You mean -- ? I'm lost.


Q. What other disciplines or subspecialties
would have been involved in such a survey?

A. Oh, I apologize. I now understand.

Q. That's fine.

A. Well, there are professional survey
organizations that help people design surveys,
pretest the surveys, conduct the surveys, have
people who are expert at doing that kind
of thing, at rounding up the right telephone
interviewers and so forth.

Q. Would you want to have a consultant in
epidemiology particularly to the extent to which
you are going to ask questions about other risk
factors that a particular person would have been
exposed to?

A. Sure, that could be helpful, although we
do have this large list of background
characteristics that the various plaintiff suits
have used. So we have in some sense the input of
epidemiologists such as Samet in Minnesota and
other plaintiffs as well as some from defendants.



Q. Would you want someone trained in 
psychology to advise this survey? 

A. I wouldn't mind; although I think that 
the professional survey houses who design these 
things have that kind of expertise with them. 


I believe -- I think they have cognitive
psychologists who worry about the way you should
frame questions to get complete answers and things
like that.


Q. Would you have wanted someone in
occupational exposures or occupational medicine,
someone who would know what exposures working-class
blue collar people might be exposed to that should
be considered as risk factors in conducting the
survey?

A. Certainly couldn't hurt.

Q. How about actuaries? Any reason to
involve actuaries if you are going to look at the 
issue of trust fund-initiated changes? Actuaries 
or benefit consultants, put it that way. Would you 
want someone on board in that discipline too in 
order to do this? 

MR. BIERSTEKER: And the "this" is? 

I'm sorry, Mike, I'm not sure. 


BY MR. WITHEY: 

Q. We're talking about the same survey we've 
been talking about. You know what I'm talking 
about, don't you, Professor? 

A. I have a vague understanding and in my
answer I'll try to explicate, if that's okay.

Q. Go ahead. I am now asking about whether
it would be useful to have actuaries or benefit
consultants advise the survey team, if you want to
call it that, or survey agency, to make sure the
questions get properly formatted and formulated.

A. A two-part answer, I believe. One is
that with respect to the survey of individuals on
the current prevalence of smoking behavior, I don't
think it would be particularly helpful. When
you're talking about the counterfactual prevalences
in a what-if world, then people who knew about what
kind of anti-smoking programs the trust might have 
initiated and how people might respond to those, it 
would be useful, I believe, to have somebody who 
would have some understanding of what kinds of 
programs the trust might have initiated. 

Then also there is a survey with 
respect to that as to which a person like that
would be useful when you actually survey the trusts
themselves and try to either dig into the records
or survey people who are involved in them to see
why they did not initiate those kinds of programs
and what would they have done without the alleged
acts of misconduct, both what they did do and what
they would have done without. So the parallel
survey of the trusts is going on with the
individuals in order to get their actions with and
without the alleged misconduct.

Q. After this conversation you had perhaps a
year ago with Mr. Biersteker and perhaps somebody
else you can't recall and the time when you
discussed this on one or two occasions with
Mr. Biersteker in the last few weeks, were you
aware of any activity or actions that were taking
place toward or related to conducting such a
survey?

A. Am I aware of any actions that were 
taking place? 

Q. Did you have any other discussions with 
anybody about the status of that idea or concept of 
conducting such a survey? 

MR. BIERSTEKER: With respect to the
Ohio case?

MR. WITHEY: No, with respect to any
case.

MR. BIERSTEKER: You're talking about
Medicaid now?

MR. WITHEY: Any case, Medicaid,
beneficiaries in any health care recovery case.

A. I believe Mr. Biersteker and I had a
conversation over the phone maybe three or four
days ago about that survey but not with respect to
who's been contacted or what state it's in. It was
at a fairly general level, talking about what
kind of organization might be useful to contact.


Q. As I understand it, then, if you had been
hired let's say approximately a year ago when you
had that first conversation to do such a survey,
your best estimate would be it would have been
completed certainly by November 6, 1998, let's say?


A. Well, my best estimate has some 


uncertainty attached to it because the first thing 
that I would have done, if they wanted to go 
forward, would be to put them in contact with a 
professional organization that did that. 

Q. And who would that be? Do you have
someone in mind?

A. Well, there's a list of them. I don't 
know people. I mean, I know people at some of 
these, but there's Roper who does polls, Westat 
does surveys, Gallup does some polls. NORC, which
is associated with the University of Chicago, does
some polling, I think. Maybe ISR does polling.

Q. That's fine. Just needed a sample.




A. Some are academically affiliated and some
are for-profit polling agencies.

Q. So I take it you would actually want to
ask them how long it would take to do such a
survey?

A. Absolutely. They are the professionals;
they have the staff of people either ready or not
ready to make the telephone calls. And presumably
the length of time it would take would have
something to do with what else they have on their
plate at that time.

Q. Now, are you aware whether defendants in 
this lawsuit have asked for formal discovery 
allowing access to conduct a survey for the names 
of beneficiaries? 

A. Whether they have asked for access -- ?
Would you read that back? I'm sorry.

Q. I can rephrase it. Are you aware whether 
the defendants, that's the tobacco industry, in 
this lawsuit in Ohio have asked for discovery of 
the names of beneficiaries in order to conduct a 
survey? Have you had any discussion with -- Go
ahead.

A. The act of discovery, I guess technically
I should know exactly what that means, but I don't.
I believe I know that they have obtained some list
of people but I don't know whether it was through
the act of discovery or.... I don't know how that
was done.

Q. Mr. Biersteker told you they had a list?

A. Or access to it or a hope of getting it.
I just don't remember precisely.

Q. When you talked about this with
Mr. Biersteker a few days ago, tell me what you
said and what he said. You can answer.

MR. BIERSTEKER: I stopped him. But 

I think there is some confusion here, Mike. If you 
want me to try to clarify it, I will. If you don’t 
want me to -- 

MR. WITHEY: I want him to answer.
explain, if you want me to.

MR. WITHEY: No. I'm asking if you 

are giving Professor Rubin that instruction, not 


discu 



MR. BIERSTEKER: Yes, I am as to the 

two or three days ago, yes. 


BY MR. WITHEY:

Q. What is the best available data on the
beneficiaries' medical expenditures in this case?

A. I don't know.

Q. What is the best available data on the
prevalence of smoking in this population?

A. I don't know what the best available is.

Q. What is the best available data on the
relative risk of smoking and specific diseases in
the general population, including lung cancer,
heart disease, et cetera?

MR. BIERSTEKER: Object to the form.
Vague.


A. For expenditures? 

Q. No. What is the best available data on 
the relative risk of smoking and specific diseases, 
developing those diseases? 

A. Relative risk for?



Q. Getting diseases from smoking. Again, 
this is in the general population. 

A. So for morbidity, you’re talking about? 

Q. Yes. 

A. I don't know what the best available data
are.

Q. What is the best available data on the
relative risk of mortality from smoking?

A. I don't know what the best available data
are. Certainly "best available" would mean
something that was a properly designed survey with
the ability to represent what the relative risk
might be in the population.

Q. I'm talking about the U.S. general
population in this question. You don't know what
the best available data is on that?

A. For relative risk?

Q. Of disease, yes. 

A. Relative risk of disease? No, I don't 

know, because certainly ACS-II was not 
representative of anything in the United States. 

Q. What is the best available data on the
relative risk of smoking and specific disease
endpoints in the beneficiary population in this
case?

A. Again, relative risk for what?

Q. Of diseases. Of diseases.

A. Mortality? Morbidity?

Q. Morbidity.

A. Best available? No, I'm not an expert in
databases per se so I'll just say I don't know.



Q. What is the best available data on the
relative risk of mortality from smoking in the
beneficiary population in this case?

A. Again, I am not an expert on all the
databases that are available so I'm not sure what
the best available data are.

Q. What is the best available data on the
medical expenditures, the total medical
expenditures of the trust fund plaintiffs?

A. Total medical expenditures? I don't
know.

Q. Have you reviewed any of the data on 
medical expenditures of the trust fund plaintiffs 
for accuracy or reliability? 

A. No. 

Q. Have you made any attempt to determine 
whether the medical costs of smokers exceeded that
of non-smokers in the beneficiary population in
this case? 


Q. How would you perform such a study, if 
you would? 


A. Such a what study?

Q. A study to determine whether the medical
costs of smokers exceeded that of non-smokers in
the beneficiary population.

A. Just a simple question of, are the total
costs or average medical costs per person
greater than per non-smoker?

Q. Yes.

A. In the population from?

Q. The beneficiary population at issue here.

A. Starting in whatever year it was, 1972,
through


Q. Fair enough. 

A. Well, certainly it is difficult to go 
into the future to determine that. And, as 
I understand it, the expenditure data is not only 
very thin in terms of background characteristics 
but I don't believe it even has information on 
smokers versus non-smokers. So to determine that
you have to make some assumptions bringing smoking
prevalence information down to the medical
expenditure information that is in the
subpopulation. And also you have to make some
fairly heroic assumptions about how those
expenditures are distributed across smokers and
non-smokers and types of people in time because it
is largely unavailable in that period of time


Q. Do you have any understanding of the
legal test for sufficiency of evidence as it
relates -- Strike that. Do you understand that
the legal test of sufficiency of evidence is one
that renders it proper to make reasonable estimates,
even though they may lack statistical or
mathematical precision, in estimating medical
expenditures attributable to smoking?

MR. BIERSTEKER: Object to the form
of the question. And I object further because
I believe it mischaracterizes the state of the
law.

BY MR. WITHEY: 

Q. You can answer. 


A. Will you reread it? 

Q. I'll state it again. Do you have an
understanding that the legal test for the proof of
the plaintiffs in this case is one of reasonable 
estimates rather than statistical or mathematical 


precision? 



MR. BIERSTEKER: Same objection.

A. I'm not sure how to answer that because
I don't know what some of the words are
supposed to mean in that context.

Q. "I don't know" is a fine answer.

A. That's fine. Thank you.

Q. Do you believe it is possible to
separate out the misconduct of the tobacco industry
from the actions and choices of smokers in
determining consumption of cigarettes in the United
States?

A. It is possible to estimate that under
explicitly stated assumptions supplemented with
data, yes.


Q. Is there more than one cause of cigarette 
consumption in the United States? 

A. It is a colloquial use of "cause" in that 
sentence, but I think I know what you're driving
at.

Q. Let me rephrase it in light of your
concerns. Fair enough.

Do you believe that there is more
than one substantial factor that determines or
affects the consumption of cigarettes in this
country?

A. It is not something I have studied
professionally, but as a layman and having read
things, peer pressure is apparently the dominant
factor, and that's not the only one or obviously
we would not have this litigation.

Q. Are there any other causes of cigarette
consumption other than peer pressure that you
believe exist?

A. Well, again "causes" meaning are there
other things that affect -- ?

Q. Substantial factors. In other words,
rather than de minimis negligible factors, anything
else that affects in a substantial way the amount
of cigarettes that are consumed, not just
individually but overall in society.

A. Well, presumably there are factors such 
as people enjoy them. It can be a social thing to 
do. Presumably advertising has an effect. I think 
we sort of described a list of things before.
Let's see.

Q. Well, you did, and we don't have to
repeat them. The things you listed before then
would be --

A. Sort of, yes. But the question before
I think was tied to the alleged misconduct piece
more than this question is, and so I don't want to
say wholesale that all those apply, but some of
them -- May I finish the sentence?

Q. Yes, of course. Go ahead.

A. But some of them would.

Q. Let me just ask you if we can divide it
into general areas, and there may be others,
but I want you to consider this: that over time
the consumption of cigarettes has been influenced
in a substantial way by smokers' habits, choices,
et cetera, and that includes I suppose peer
pressure, people enjoying it, it's a social thing
to do, their parents did it. There's a whole
series of things. Fair enough?

A. Fair enough. 

Q. As well as certain conduct, whether you 
call it misconduct or not, of the tobacco industry, 
advertising, promotion of cigarettes, et cetera.
Fair enough?

A. Presumably, yes. 

Q. Well, you believe that, right? That 

those are at least two large areas that have caused 
people to consume cigarettes overall in an
aggregate sense?

MR. BIERSTEKER: In his non-expert
opinion, as he previously stated. Right?

A. That's not what I do professionally, but
certainly one would think that advertising and
promotion by cigarette companies has an effect
because presumably they spend huge amounts of money
doing it and they think it must have an effect.

Q. Now, on page 13 of your report, Exhibit
1 -- I think it's Exhibit 1, isn't it? -- you
discuss another chart. Do you have it there?

A. Yes, I do, I believe.

Q. And that includes a factual world with 
alleged misconduct starting in 1965 and then the 
counterfactual world where the defendants actually 
disseminate information on health risks. Do you 
see that there? 

A. Yes. 

Q. And you have described then the essential
characteristics of a proper data-collection and
modeling approach. Do you see that? 

A. Correct. 

Q. I assume, then, the essential
characteristics of a proper data-collection and
modeling approach include determining date of
birth, family history, smoking habits, diet, the
defendants' actions in withholding certain
information, the effect of that on those
individuals, then the actual end course, in this
case "contracts lung cancer." Obviously that's not
everybody, but at least what happens to
them in terms of their health status would be an
essential characteristic of the proper data-collection
methods that you've described. Correct?

MR. BIERSTEKER: Objection, compound.
Go ahead.


A. Correct in a general sort of way. 

I mean, just as you pointed out, "contracts lung 
cancer and dies" is not an essential characteristic 
of all people to do this analysis. Neither 
necessarily are the other variables I referred to. 
But the point is these additional essential 
characteristics are broad ideas; they are
illustrated by the example as described earlier.

Q. Let me ask you a hypothetical, if 
I could. Let's assume there are two members of a 
beneficiary population let's say in a trust fund 
health care payor; one of them smokes and the other
one does not smoke. Okay?

A. Okay.

Q. One of them is a current smoker, long-term
current smoker, and the other one never smoked
in his life. Okay? And let's assume that the
smoker in the first year contracts lung cancer.

A. In the first year starting when? After
what?

Q. Let's just say in the first year under
review, let's say the first year of assessing
whether there is a difference between smokers and
non-smokers.


A. Okay. So smoker contracts lung cancer in
that first year?

Q. Right.

A. Can we give it a date just so we have 
something, like 1980? 

Q. Sure. 

A. Okay.



people who could differ in lots of other things 
other than smoking? 

Q. Correct. 

A. And the question is what? 

Q. The question is: In that year, did the
trust fund incur more expenditures that were due to
smoking, assuming the doctor diagnosed lung cancer
as related to smoking, than for non-smokers?

A. A simple smokers and non-smokers?
I don't have any idea. Did they pay more for the
person who is a smoker than the person who is a
non-smoker? And the tautological answer is yes,
they paid more because you told me they paid more.
But it has nothing to do with causality or due to
smoking.

Q. But I asked you to assume it was due to
smoking.

A. No, you didn't. Do you want me now to
assume that it was due -- ?

Q. Yes.


A. He was being treated for cancer? 

Q. Smoking-related lung cancer, correct. 

A. There are people who get lung cancer who 
don't smoke. So the fact is that the causal
question due to smoking has to do with the
counterfactual world whether that guy would have 
contracted lung cancer and been treated the same 
way had he not smoked, so I can't accept that. 


Q. You can't accept that? I am just asking
you to assume that the doctors diagnosed this as a
smoking-related illness that the person got because
that person was a smoker. Okay? I want you to
assume that.

A. That the doctor said this?

Q. Yes.

A. Or that it's true?

Q. Both that it's true and that the doctor
said it.

A. Just to be sure I understand what you're
asking me to assume, if I may, you're asking me to
assume that in the counterfactual world, had this
fellow not smoked, then he would not have any
medical cost in 1980. That's what you're asking
me. That's the only way I can understand what
you're asking me to assume. So if you agree with
that, I'll assume it.

Q. What do you understand it to mean when a 
doctor says "I find that your lung cancer is caused
by your smoking"?

A. That the doctor says your lung cancer is 
caused by smoking. 

Q. It means what as to whether he wouldn't
have gotten it if he hadn't smoked?

A. The doctor is making up an opinion,
that's all it means.

Q. Okay. I want you just for the sake of
this hypothetical to assume that that opinion is
true.

A. And just to make sure that we both agree
on what that assumption means, I want you to tell
me that you agree with what I think it means,
because otherwise I can't go forward.

Q. In other words, you can't take my
assumption asking you to assume that the doctor was
truthful and was correct in saying "Your lung
cancer is caused by smoking"?

A. I am going to predicate my answer and say 
what I interpret that to mean is that the doctor 
has knowledge of God, that without smoking this 
person would have no medical expenses in 1980. 

Q. No, I'm not asking you to assume he would 
have no medical expenses. I'm just asking you to
assume that he determines that the disease and
therefore the medical expenses that this person, 

his patient, incurred were attributable to his 
smoking. I'm asking you to assume that. 


A. Then I have to assume what I just said
I have to assume.

Q. No, I'm not asking you to assume anything
other than what I'm asking you to assume. Are you
unable to take my assumption and then answer
questions on it? You have to make some other
additional assumptions in order to answer my
original question?

A. No, I am saying what your assumption
means to me and I want to make sure you're clear on
what that assumption means to me.

Q. It doesn't mean what you said it means,
it means only what I said it means.


A. Well, if it doesn't mean what I have 
interpreted it to mean, I have no idea what you're 
talking about. 

Q. And you can't answer the question then. 


Correct? 


A. I cannot answer the question except in a 
way that makes scientific sense. I can answer the
question in a way that makes scientific sense;
otherwise it doesn't, in my mind. 

Q. Oh, so a doctor saying that this person's
lung cancer was caused by their smoking doesn't
make scientific sense to you?

MR. BIERSTEKER: Objection,
mischaracterizes his testimony.

BY MR. WITHEY:

Q. Answer that question. Does it make
scientific sense to you for a doctor of a person
who has been smoking for let's say twenty years who
has lung cancer, to make that diagnosis and
make that causation link doesn't make scientific
sense?

A. He can make that claim. All right? But
you've told me to assume that that claim is
true.

Q. That’s right. 

A. Now I have to tell you what it means to 
assume that that claim is true. 

Q. Let's just assume then that he made the 
statement. 

A. But I don’t have to assume it's true? 

Q. Assume he made the statement and that for
purposes of medical science, for purposes of the
expenditures of the trust fund, everyone in that
process agrees that the smoking caused that cancer.
Okay? Make that assumption. Okay? 

A. Who are everybody in the process? 

Q. The doctors, the patient and the health
care payor that made the payment. Okay?

A. So they all agree that had he not smoked
he would have no medical expenses in 1980?

Q. No, I'm not asking you that. I know you
added that to it. I'm not adding that to it.

A. May I finish? Because I was going to try
to add something else that may be helpful, if
I could please finish my answer.

Q. Finish your answer. Go ahead.

A. I have to reconstruct where I was.
Excuse me a moment.

Another version of that is you want
me to assume that had the person not smoked, he
would have no medical expenses for the treatment of
lung cancer.

Q. Why don't you go ahead and make that
assumption.


A. So you are willing to -- ? Okay. So
that agrees with what you've said.

Q. I will incorporate that within the
hypothetical --

A. Okay. Now what's the question?

Q. -- since you can't answer it without
that assumption.

A. Well, it doesn't make sense otherwise,
I can't answer it.

Q. Doesn't make sense to whom?

A. I don't think it makes sense to anybody
who thinks about what causal inference is. And it
may be funny to you. That's fine. Doesn't bother
me.

Q. I'm asking it doesn't make sense to the
patient when the doctor tells him "Your lung cancer
was caused by your smoking," what doesn't make
sense about that to the patient? Is the patient
likely to say, "Gee, you mean if I hadn’t smoked 
I wouldn't have had any medical expenses this 
year?" Is that common sense about what a patient 
might ask? 

A. I think that -- I don't know whether a 
patient would ask that or not. But that’s what the 
patient is probably thinking: Had I not smoked,
I would not be here being treated for lung cancer
right now. That's what the patient is thinking. 

And you disagree with that? I don't know how to — 

Q. No, I'm saying you can include that.
That's different than saying he wouldn't have had
any other medical expenditures for any other
[illegible].

A. That's why I wanted to continue when you
interrupted, to offer that as a helpful addition.

Q. So we're clear, in year one the non-smoker
has no medical expenses, the smoker has medical
expenses that are caused by his having lung cancer,
which is caused by his smoking, and that if he
hadn't smoked he wouldn't have got the lung cancer
and wouldn't have expenses for lung cancer. Okay?

A. That's what we mean by "caused by
smoking," fair enough.

Q. Now, has the trust fund incurred
smoking-attributable expenditures in that year?

A. They have for that person. Although we 
don’t know without the other part of the 
hypothetical whether they would have incurred more 
or less expenses for other diseases, because we 
don’t know. You haven't told me about his other 
diseases. You won't give me that.

Q. I'm just asking you, has the trust fund 
incurred smoking-attributable expenditures given 
the hypothetical I've given you? 

MR. BIERSTEKER: Objection to form. 



A. Well, that's not quite the same question.
I think you want to ask, has the trust fund
incurred smoking-related expenses as a result of
his smoking. Right?

Q. Look, I know you want to rephrase my
question. If you can't answer the question I'm
asking, say "I'm sorry, I can't answer it."
And then I'll ask you, if I rephrase it, would you
be able to answer it. But I would like an answer
to my question. Has the trust fund in that
hypothetical incurred smoking-attributable
expenditures?


MR. BIERSTEKER: Objection, vague. 

A. Smoking-attributable expenditures? 
Expenditures for smoking-related disease? 

Q. Yes. 

A. Okay, that's what that meant? 

Q. Mm-hmm. 

A. Yes, they have incurred that because 
we've talked about what that meant.

Q. And that amount was $10,000 in my 
hypothetical. Correct? 

A. Given the caveats we both have agreed on,
yes.

Q. Now, if you were to compare the medical
expenses in that trust fund for smokers as
compared to non-smokers, would you find that
smokers' medical expenditures are higher, the same
as, or less than non-smokers'?

A. I have no idea.

MR. BIERSTEKER: Object to the form.

A. Because we talked about one person.

Q. That's what we're talking about. There's
only two in this trust fund. That's the
hypothetical. There's only two beneficiaries in
this union bargaining unit.

A. Either reword the question or ask it 
again and I'll try to answer. 

Q. Has the trust fund incurred higher 
medical expenses, the same medical expenditures, or 
less medical expenditures for smokers as compared 
to non-smokers?

A. Simple descriptive question, nothing to
do with causality; there's one smoker, one
non-smoker. They incurred more expenses for the smoker
than they did for the non-smoker. Nothing to do
with causality at all, not due to smoking, just for
the smoker and non-smoker, I agree. That's just a
tautology.



Q. But if you add to it the causation
element, that the reason the smoker got lung cancer
was smoking and that had it not been for their
smoking they would have incurred no expenses for
lung cancer in that year, if you add that
hypothetical, which is what you asked and which is
part of the hypothetical, would you agree that the
trust fund had greater expenditures for smokers
than for non-smokers in that year?

A. No, not necessarily.

Q. Why not?

A. Because if you had added more 
expenditures for smoking-related diseases, then 
I would agree. But had the guy not been a 
smoker — this goes back to the other part of the 
question -- had the guy not been a smoker, maybe 
he would have had massive heart attacks, massive 
strokes. Who knows? I'm not saying that's going 
to happen. I'm just saying if you want to ask that
question, that expense in total, we have to make 
that assumption that you don't want to make. 

Q. No, I can make that assumption. 

A. Okay, fine. 

Q. But your testimony is then the trust fund
had greater expenditures for smoking-related
diseases than for non-smokers or
non-smoking-related diseases in that hypothetical.
Correct?

A. Yes.

Q. I want you to further assume that in year
two, 1981, the next year, the smoker dies of
lung cancer.

A. Okay, mm-hmm.

Q. And the non-smoker gets a rare eye
disease called myelodysplasia. It's a rare disease
and he's treated for a year and the costs are
$10,000 and the doctors determine that the 
myelodysplasia is not caused by smoking, it's not 
caused by anything that they can determine is 
related to smoking, it's just a rare disease. 

Okay? 

A. May I ask for clarification?

Q. Sure. 
A. So that means, for example, if the guy in
this counterfactual world had smoked, for instance,
he would still have the disease?

Q. Correct.

A. No matter what happened with respect to
smoking he would still have the disease.

Q. Right.

A. So nothing has changed for him in the
counterfactual versus the factual world.

Q. Now, in that year are the
smoking-attributable expenditures greater than
expenditures that are not attributable to smoking
or less than?

MR. BIERSTEKER: I am going to object
to form, but you can go ahead.

A. Continuing the same assumptions from
before, they are greater. The smoking-attributable
expenditures are greater. They're due to smoking.

Q. No, I'm just talking about in the second 
year. I'm not talking about in the first year. 

A. Oh, because the fellow died. I’m sorry. 

I missed the question. 

Q. All right, let me go back again. The 
second year, we're talking about the second year, 
the smoker who had the lung cancer died. He’s 
gone.

A. He's gone, right.

Q. No expenditures. All right?

A. Correct.

Q. The non-smoker develops myelodysplasia
and incurs $10,000 of expenses, not related to
smoking. Because he didn't smoke. All right?

A. Correct.

Q. Now, in that circumstance, did the trust
fund incur any smoking-related expenditures?

A. In 1981?

Q. Correct.

A. [illegible]

Q. And I would assume you would agree that
the expenditures for non-smokers exceeded that for
smokers that year?

A. In the simple descriptive sense, yes.
But not in the causal.

Q. Now, let me ask you this: When you 

combine the two years, 1980 and 1981, did the trust 
fund incur smoking-related expenditures in the way 
we've defined it of $10,000 for the combined two
years?

A. Due to smoking?

Q. Yes.

A. Yes. Causally they expended $10,000 more
than they would have in the absence of smoking
under the assumptions we made where the smoker, had
he not smoked, would not only not have had lung
cancer and had no medical expenses for lung cancer
but he would have had no medical expenses at all in
'80 or '81. You have to make that assumption. No
medical expenses in '80 or '81.

Q. For whom?

A. For the smoker, had he not smoked. See, --

Q. No, we're not talking about no medical
expenses. You're changing it. We're saying no
medical expenditures for lung cancer.

A. Okay, if you want to do that, then we
have --

Q. That's what we agreed upon. Let's don't
go back on it.

A. Two questions back you agreed on the
other part too. I'm perfectly happy to talk about
expenses for treating lung cancer. Okay? So
that's the question now? But the answer still
remains. May I clarify that? You have to
assume that the non-smoker, had he not smoked,
would have none. So without smoking, the smoker
without smoking in 1981 would not have incurred any
costs for smoking-related disease.

Q. No, we don't.

A. Yes, you do.

Q. Now you're going back on it.

A. No, I'm not. Please listen to what
I said.

Q. I am listening to what you said.

MR. BIERSTEKER: I think you're going
back, Mike.

MR. WITHEY: The record will be very
clear on this.

MR. BIERSTEKER: I think it will be,
and you're going to be wrong.

BY MR. WITHEY:

Q. What you're asking me to assume and what
you want me to put in my hypothetical is that the
smoker, had it not been for smoking, would not have
had any expenditures related to his lung cancer.

A. In 1980?

Q. In 1980.

A. Okay.

Q. In 1981 he dies, so there are no 
expenditures whatsoever. 

A. We're talking about a counterfactual
world now.

Q. I'm not talking about a counterfactual
world.

A. Yes, you are.

Q. I'm talking about my hypothetical. Let's
go on. We've been very clear about this; it's on
the record. We've been very clear what we mean by
expenses related to smoking and I have changed
my hypothetical based on your suggestion. Let's
say now in year three, 1982, medical science
develops a cure for myelodysplasia. Okay?

A. Well, if we can't agree on what's going
on in '81, which I don't think we have quite yet,
I'm sort of reluctant to go on to 1982. But if you
want to go on to 1982, you're asking the questions.

Q. Let me ask: In 1981 what is the relative 
cost for those two years, '80 and '81, for treating 
smokers versus non-smokers? 

A. Just the costs, nothing causal? 

Q. Right. 
A. Then the costs for treating smokers are
$10,000 and the costs for treating non-smokers are
$10,000. The costs for treating smoking-related
diseases or cancer, specifically, the costs are
$10,000 for the smoker total for '80 and '81 and
zero for the non-smoker for '80 and '81. If we now
want to make the question causal, we have to be
more careful.

Q. Let's make the question causal in the
sense that had the smoker not smoked, he would not
have developed lung cancer and would not have
had any medical expenses related to lung
cancer.

A. In 1980 and '81? Before it was just
1980. And you now want to assume that he would not
have incurred -- ? Because he might have. There
are people who get lung cancer who don't smoke. So
in this counterfactual world without smoking, I'm
just trying to be explicit.

Q. But I'm trying to ask you to assume my 
hypothetical rather than fight it. And I am 
assuming that even though it is true generally that 
there are people who get lung cancer who don't 
smoke, that in this case this patient got lung 
cancer because he smoked. Okay?

MR. BIERSTEKER: Objection. Is the
question complete?

MR. WITHEY: Yes.

BY MR. WITHEY:

Q. I want you to assume that. What was the
relative cost attributable to smoking versus
non-smoking in that trust fund for those two years?

MR. BIERSTEKER: Objection,
incomplete hypothetical, and vague.

A. For treating cancer, lung cancer,
assuming that had he not smoked he would not have
gotten lung cancer in either '80 or '81, the causal
costs due to smoking are $10,000.

Q. And what is the relative cost of smokers
versus non-smokers in that trust fund for those two
years treating all diseases, the relative cost?

A. Treating all diseases? Just the relative
cost, just descriptive, no causal? Or causal?

Q. Right. 

A. Which? 

Q. Not causal. Just what is the relative 
cost of treating — I'm going to ask you two 
questions. One is causal, one is not causal. What 
is the relative cost of treating smokers versus
non-smokers in those two years in that trust fund?

A. Summed over those two years it cost
$10,000 to treat each of them, so purely
descriptively the ratio is 1, if that's what you're
looking for. By relative you mean the ratio?

Q. That's what I'm asking, yes, one related
to the other, the ratio of one to the other.

A. Okay. You could have meant something
else by that, but ....

Q. No, I meant what you meant.

A. Okay.

Q. So therefore the relative cost of smoking
versus non-smoking has gone down under your
definition or under your answers from $10,000 in
one year to zero for non-smokers, $10,000 in one
year to zero in another, looking at those two
different years.

A. You asked an entirely different question.
The descriptive one is just looking at the numbers
you gave me over '80 and '81 and you get $10,000
for one for the smoker and you get $10,000 for the
non-smoker, so $10,000 equals $10,000. The causal
question, which is extra costs that are due to
smoking for treating lung cancer summed over '80
and '81, is $10,000 more due to smoking in that
example.

Q. Right. But for the two years combined
the relative costs of treating diseases caused by
smoking as opposed to other diseases not caused by
smoking is 1. Correct?

A. No, not causally, no.

Q. It is not 1?

A. Not causally, no. That's just
descriptive.

Q. What is it?

A. The descriptive answer is 1. It's just
simple addition of $10,000 plus zero equals zero
plus $10,000. That's the description.

Q. But I'm asking you a different question
now. I'm not asking you on the descriptive level.
I'm asking you to understand that the hypothetical
includes a causation element.

A. Causal, okay. 

Q. And you're going to assess, if you are 
asked the question, for those two years in the 
trust fund what ratio exists of medical 
expenditures caused by smoking versus those that 
were not caused by smoking.

A. Under our assumptions the ratio is
infinite. It is $10,000 over zero, which is
infinite.

Q. I thought those not caused by smoking in
the hypothetical was $10,000.

A. Not causal, no. See, that's why I'm
trying to be very, very clear. I'm not trying to
fight with you at all. I think this is a
fundamental misunderstanding that you have.

We have to talk about what would have
happened in the world without smoking. Okay? For
both people. And in the world without smoking we
are assuming that the expenses for the non-smoker
would be identical; nothing would change. So
therefore for him there are no costs due to
smoking, causal costs due to smoking. For the
smoker you asked me to assume that, had he not
smoked, in a world without smoking he would have
spent zero dollars in 1980 for treating cancer,
lung cancer, and zero dollars in 1981 for treating
lung cancer. And you've told me that in a world
where he smoked, with smoking, he spent $10,000 in
1980 and zero dollars in 1981. So for him, for the
smoker, the extra cost due to the existence of
smoking is $10,000. For the non-smoker the extra
cost due to smoking is zero.
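
[Editor's illustration, not part of the testimony. The distinction the witness draws between the descriptive ratio and the causal excess cost can be sketched in a few lines of Python. The dollar figures come from the hypothetical as stated in the questions; the names `actual`, `counterfactual`, `descriptive_ratio` and `excess_cost_due_to_smoking` are made up for this sketch, and treating the smoker's counterfactual costs as zero is the assumption the witness says he was asked to make, stipulated in the testimony only for lung cancer expenses.]

```python
# Dollars the trust fund actually paid, per the two-person hypothetical.
actual = {
    "smoker": {1980: 10_000, 1981: 0},      # lung cancer in 1980, dies in 1981
    "non_smoker": {1980: 0, 1981: 10_000},  # disease unrelated to smoking
}

# ASSUMPTION: costs in the counterfactual world without smoking.
# The smoker is assumed to have no lung cancer costs; the non-smoker's
# costs are assumed identical to the actual world.
counterfactual = {
    "smoker": {1980: 0, 1981: 0},
    "non_smoker": {1980: 0, 1981: 10_000},
}


def total(costs_by_year):
    """Sum one person's costs over all years."""
    return sum(costs_by_year.values())


def descriptive_ratio(costs):
    """Smoker total divided by non-smoker total; no causal content."""
    return total(costs["smoker"]) / total(costs["non_smoker"])


def excess_cost_due_to_smoking(actual_costs, counterfactual_costs):
    """Per person: actual total minus total in the world without smoking."""
    return {
        person: total(actual_costs[person]) - total(counterfactual_costs[person])
        for person in actual_costs
    }


ratio = descriptive_ratio(actual)
excess = excess_cost_due_to_smoking(actual, counterfactual)
print(ratio)   # 1.0
print(excess)  # {'smoker': 10000, 'non_smoker': 0}
```

[The descriptive ratio compares two people in the actual world. The causal figure compares each person with himself in the assumed world without smoking, which is why the non-smoker's $10,000 drops out of it.]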

Q. Did I ever ask you to answer the question
in a counterfactual world?

A. That's what we're doing right now.

Q. No, we're not.

A. That's what we're doing.

Q. I never asked you the question in a
counterfactual world absent smoking other than the
smoker, that if he hadn't smoked, he wouldn't
have gotten lung cancer. I didn't say in a world
without smoking. Okay? I'm not asking you about
in a world without smoking. You've imposed that on
the hypothetical and I would like you to keep it
out. Unless you're saying you can't answer the
question without it.

MR. BIERSTEKER: That's why it is an
incomplete hypothetical.

A. It's not only that, but you already
accepted it.

Q. I never did.

A. Yes, you did.

Q. I never did.

MR. BIERSTEKER: Let's not argue.

BY MR. WITHEY: 

Q. I said if it hadn't been for smoking, he 
wouldn't have gotten the cancer. 

A. We're talking about the non-smoker. You
said he would have had the exact same medical
expenses whether or not there's smoking. Smoking
has nothing to do with it.

MR. BIERSTEKER: Can we go off the
record for a minute and let me try to help? Do you
[want to] take just one minute?

MR. WITHEY: Okay, let's go off the
record. Why do we have to go off the record?

MR. BIERSTEKER: Because this record
is a [illegible] anyway. Do you want to hear my
help or not?

MR. WITHEY: I don't want him to hear
your help.

THE WITNESS: I'll go to the
bathroom.

MR. BIERSTEKER: He can go to the
bathroom; I don't care.

MR. WITHEY: We're off the record.

(Discussion off the record.)

BY MR. WITHEY:

Q. Let me make sure you understand my
hypothetical, that the smoker has no other medical
expenses other than the fact that his doctor has
determined -- which I'm asking you to assume --
that his lung cancer has been caused by his
smoking. Okay?

A. And just to make sure we're both clear on
that, that means that had he not smoked, he would
have no lung cancer expenses in 1980, '81 or '82,
[if we're] going to go that far, to '82?

Q. Well, he died in '81.

A. No. But I'm saying had he not smoked, he
wouldn't have died in '81. Isn't that correct?
You want me to assume that?

Q. Yes, that's true.

A. So what I'm asking to be clear about, and
I think we're clear but I want to make sure, had he
not smoked, he would be alive with no lung cancer
expenses in 1980, he would be alive with no lung
cancer expenses in '81, and he would be alive with
no lung cancer expenses in '82. That's what you
wish me to assume. Is that correct?

Q. I'm just asking you to assume in 1981
that he died.


A. In the actual world because he smoked. 
But we have to — 

Q. I'm not dealing with any counterfactual.
I think you're trying to impose some counterfactual
world without any smoking whatsoever when all I'm
asking you to do is to compare the costs of smoking
to non-smokers. Okay? That doesn't assume --

A. So this is causal?

Q. I'm just making sure you understand that
the smoker had no other medical costs.

A. In the actual world?

Q. Never had any other medical cost in the
actual world and of course he died in 1981.

A. That's all I should assume at this
moment?

Q. Yes.


A. So forget the other things we were 
assuming before. 

Q. No, assume all the other things we were 
assuming before, that he died of cancer, that his 
cancer was related to smoking, that if he had not 
smoked he would not have had medical expenses 
related to cancer. 
A. That's the counterfactual world. That's
what counterfactual meant. Had he not smoked he
would not have contracted lung cancer, he would
have had no medical expenses for lung cancer in
'80, '81 or '82 and he would be alive in '80, '81
[and '82]. Is that correct? You were willing to buy
that assumption before.

Q. Yeah, but that's because you've now
[imposed] an entire counterfactual world as opposed
to what the judgment of the medical doctor was. By
saying "Your smoking caused your lung cancer," it
means if you hadn't smoked you wouldn't have gotten
it. You wouldn't have gotten lung cancer. And that's
the medical doctor saying that, and I'm asking you
to believe that to be true. But I'm not asking you
to believe anything else about whether, had he
stayed alive, he wouldn't have incurred any medical
expenses. I'm not asking you to assume that.

A. So that what happens is had he not 
smoked, he would not have had lung cancer or any 
expenses for lung cancer in 1980, '81 or '82. Is 

that correct? 

Q. For lung cancer, correct. 

A. That's what I was saying, I believe. 
Q. Now, in year three then -- All right.
We've got year one, year two, '80 and '81. Okay?

A. Right.

Q. In year three medical science develops a
cure for myelodysplasia and the non-smoker who had
myelodysplasia is cured then and has no medical
expenditures in 1982, year three. Could you tell
me what the relative cost of smokers versus
non-smokers for the trust fund for the three years
is?

A. Just descriptive?

Q. Yes.

A. So for the smoker it's 10,000 plus zero
plus zero. For the non-smoker it's zero plus
10,000 plus zero. And the ratio of those two
numbers is 1. Just descriptive, not causal.

Q. And again in year three add another fact,
same description, but add another fact, and that is 
the non-smoker has additional medical costs for 
eyeglasses and prescriptions and other expenditures 
of $1,000 related to his eyes, although he's cured 
of this disease — okay? — that are covered by 
the trust fund. 

A. Okay. And I presume you want me to 
assume that it is completely unrelated to smoking?

Q. Correct.

A. Those extra expenses are completely
unrelated to smoking?

Q. Correct. Now, in that circumstance,
then, the relative expenditures for smokers versus
non-smokers has gone down in the sense that the
non-smokers have now cost $11,000 total as opposed
to smokers who've cost $10,000. Right?

A. Purely descriptive, nothing to do with
causal inference about the effect of smoking on
medical expenditures. Under all those scenarios
the excess cost of smoking causal has remained the
same, $10,000 more.

Q. I want you to further assume that year
one, both the smoker and the non-smoker were
64 years of age.


A. Okay. 

Q. No change in the hypothetical otherwise.
Okay?

A. Okay.

Q. That means in year two the smoker and
non-smoker were 65. I want you to assume that.

A. Okay.

Q. I further want you to assume again that
in year two the smoker dies. Okay? 

A. Okay. 

Q. And the non-smoker goes on Medicare and
is not covered by the trust fund.

A. Okay.

Q. For the trust fund in that circumstance
over those three years, what is the relative
expense of treating smokers versus non-smokers?

A. Just purely descriptive again?

Q. Yes.

A. It is 10,000 plus zero plus zero and the
other one, the beneficiary expenses are zero,
zero, zero, if I understood it right. So it is
again infinitely more because it's 10,000 over
zero. But that's purely descriptive.

Q. That's good enough, Doctor. Good enough.


A. Just wanted to make sure that.... 

Q. Just want to make sure what? Go ahead; 
finish your sentence. 

A. I just wanted to make sure that 
previously most of the questions were about the 
cause, due to smoking or caused by smoking, and so 
I understand that because -- I’m not arguing this 
is not good enough at all, but I'm saying the other
part of our conversation, the deposition, was 
dealing with the causal question. 

Q. Now let me ask you this question. In
that scenario, what was the relative portion of
costs to the trust fund that was caused by smoking?

MR. BIERSTEKER: Objection,
incomplete hypothetical.

A. Under the same assumptions that we made
before, I presume you're talking also about costs
of treating lung cancer. I mean costs of
treating lung cancer due to smoking, caused by
smoking.

Q. Yes.

A. So all of the expenses in all those
scenarios were due to the existence of smoking.

Q. Let me ask you this question now. When
you in your reports talk about a counterfactual
world, and I can cite it but you probably know it
better than I do, you talk about a counterfactual
world of the existence of smoking, starting on page
17, for instance.

A. Is the report 1?

Q. Yes.

A. Yes.

Q. And you're critiquing in part Dr. Harris
in this section called The Existence Of Smoking on
page 17 through, I guess it goes through 21.
Actually, I'm sorry, it goes beyond that. But at
least that relates to the existence of smoking.
Your counterfactual world posits that there is no
smoking in the world. Correct?


A. It's not my counterfactual world. I am
just describing if you are trying to address the
causal effects of smoking relative to a
counterfactual world where smoking does not exist,
that's what's going on here. It's not mine.

Q. You say "To estimate the health care
costs incurred by the plaintiff trusts as a result
of the existence of cigarette smoking, one has to
compare the plaintiff trusts' health care costs in
the factual world to what those costs would have
been in a counterfactual world in which no
cigarette smoking existed." Correct?

A. Correct. 

Q. And you did not therefore posit a 
counterfactual world where smoking exists but the 
actions of the defendants' misconduct does not 
exist. Correct?

A. Well, I am describing what would happen
if you want to ask these questions. I'm not
saying, hey, I like this counterfactual world. I'm
saying let's assume this is the question being
asked. It's not my world.

Q. You're saying that if you were going --
Well, you are assuming, though, that the only
recovery or damages the trust funds can recover
are for those expenditures for smoking that are
caused by the defendants' misconduct. Correct?
I mean, that's your testimony. You believe that's
correct.

A. That's what the definition of causal
effect must be and it is in respect to some other
counterfactual world. Here I'm talking about the
existence of smoking and I'm distinguishing that
from causal effects due to misconduct.

Q. You understand that causation, legal 
causation in court as allowed by the courts in this 
case is a judgment by the court based upon an 
analysis of the law. Correct? 

A. I will accept that. You're asking me to
make some assessment of having read a legal
document which I have not read.

Q. And that may differ from your definition
of causation. Fair enough?

A. The law's definition may differ from the
scientific definition. I can accept that.

Q. And do you have an understanding that the
law's definition often involves policy
considerations about how far liability should
extend and whether -- Well, let me just stop
there. It involves legal judgment and policy
considerations about who should be held liable for
misconduct. Fair enough?

A. I don't know. If that's what legal
causation has to do with, I will accept your
representation that it does.

Q. But the causation terms that you are
using deal with causation in the statistical
sense or more in the scientific sense than in the
legal sense. Fair enough?

A. Statistical, scientific, medical, 
epidemiological, yes. 

Q. But you understand there may be more than 
one cause of a particular disease, for instance. 
Correct? 
A. That's a different use of the word
"cause" when we were talking about causal effects
of something. But I understand what you mean by
that.

Q. You understand epidemiologists may
attribute more than one thing as the cause of a
particular disease?

A. Yes, I understand that.

Q. Smoking and asbestos cause lung cancer.
Correct? Or do you know?

A. I know that there are claims that smoking
and asbestos are contributing causes to lung
cancer.

Q. And in the legal realm you would be
familiar with the fact perhaps that more than one
cause or concurrent causes may combine to cause a
particular injury to a person?

A. I understand that from the layman's side.
I mean the legal side of the layman's sense.

Q. And you are not here to judge or to tell 
the judge that the only damages that can be awarded 
to the plaintiffs are those that are caused by the 
defendants' misconduct as opposed to the 
defendants' misconduct and other causes. Fair
enough? You're not here to tell the judge that?

A. No. If the judge wants to award damages 
due to other things besides misconduct of the 
tobacco industry, that's a legal call. 


Q. And you would not testify that it's wrong
to do that or unscientific to do that. Fair
enough?

MR. BIERSTEKER: Objection, compound.

MR. WITHEY: Strike that, strike
that.

BY MR. WITHEY:

Q. You are not an expert on less risky or
safer cigarettes, are you?

A. [illegible]

Q. You have not read the literature on
whether alternative nicotine devices are or are not
safer cigarettes. Fair enough?

A. I have only glanced at little parts of
[it].

Q. You have no opinion, I take it, then, 
whether the introduction of certain kinds of safer 
cigarettes would or would not have reduced disease 
in this country. Is that correct? 

A. No, I have no independent opinion on 
that.

Q. I assume you would defer to people who
actually have some expertise and training in that
topic to make judgments about that. Fair enough?

A. Yes, although it is possible I could be
helpful in analyzing data to help inform their
opinions.

Q. What data on the development of safer
cigarettes have you looked at?

A. I have not looked at any data. I said
I might be helpful in helping them look at
data they have. That's what I do.

Q. You haven't done it, so you don't have
any opinions on that at this point in time.

A. I don't have any opinions on what the
results of those data analyses would be. But there


are many contexts in which people have collected 
data and they turn to me for help in analyzing the 
data to address questions that they formulate. 

Q. Have you made any estimate yourself of 
the effect of defendants' misconduct on 
expenditures of the trust funds? 

A. No, I have not. 
Q. I take it you will offer no opinions on
how much those expenditures may have been or may 
not have been. Fair enough? 

A. Well, I don't know. To the extent that
I may offer opinions about the kinds of analyses
that were done and the validity of those analyses,
I will offer opinions about that.

Q. No, I'm not asking about that.
I understand you have criticisms of someone else's
methods. I'm just talking about whether you
are going to say the expenditures of the
trust funds attributable to the defendants'
misconduct were X.

A. And I am only saying that it is possible
that some analyses I might like more than other
analyses and I may offer opinions about why this
analysis is better than some other analysis.

MR. WITHEY: Would you read back the
question to the witness, please.

(The reporter read the question.)

Q. X meaning a number.

A. If there is an analysis that does come up
with X that I think is a good analysis, then I may
support X.

Q. Have you seen such an analysis?

A. Not yet. 

Q. Well, we're here to take your deposition. 
We need to know all the opinions you're going to 
reach. Just for the record, if you offer an 
opinion that I think the expenditures of X as 
determined by some other person are accurate, then 
we're going to say "Well, you didn't have the 
opinion at the time of your deposition, you can't 
say it." Do you understand that? 

A. I have no opinions about the value of X. 
I have opinions about the analyses, so I may say 
some analyses I like more than other analyses and 
I really like this analysis and then, therefore, 
I guess I like the answer. I will not like the 
answer because the answer is favorable or 
unfavorable. I may like the mode of analysis and 
therefore I will like the answer, perhaps. 

Q. But you haven't seen such analysis so you 
have no opinion presently. Fair enough? 

A. Correct. 

Q. Now let me talk to you a little bit about 
multiple imputation. First of all, could you 
define multiple imputation? 

A. Imputation generally is the idea that 
when you have missing data, there's a piece of 
information, a missing datum that you wish you had, 
imputation describes filling that value in and then 
basically pretending as if it were a real value 
even though it is an imputed value, not a real 
value. There are some attractive features about 
doing that for subsequent analyses, but the main 
problem is it is generally an invalid thing to do 
because it doesn't represent the uncertainty, 
[illegible] something is known when it's not known 
[illegible] generally leads to biased results but 
almost always leads to underestimation of 
uncertainty. The confidence intervals are much 
narrower than they should be. 




The idea of multiple imputation is to 
replace each missing value by two or more values 
chosen in a particular way that represent the 
uncertainty as to what that value would have been 
had you observed it. And if done properly -- 
there's a technical meaning of the word proper -- 
it turns out that straightforward analyses of a 
multiply imputed dataset, a dataset with more than 
one value filled in for each missing value, will 
lead to valid inferences: that is, point estimates 
that are essentially unbiased; they're estimating 
the right quantity; confidence intervals that have 
the right coverage; hypothesis tests that have the 
right level. 

So it is a way of representing and 
adjusting for the uncertainty in missing values 
such that you can get the right answer out at the 
end. 


Q. And would you agree that in the context 
where missing data is found within a particular 
discipline, that will take some judgment 
using the tools of analysis of that school of 
discipline to determine what the significance of 
the missing data is in relationship to the overall 
goal of the particular analysis? 

A. Could you clarify what you mean by 
significance? I presume you don't mean in the 
statistical sense. 

Q. Right, I don’t mean in the statistical 
sense. I mean that it takes some judgment to say, 
well, we’ve got missing data here, let's use our 
judgment as an historian, let's use our judgment as 
a doctor, let's use our judgment as a public health 
official, let's use our judgment as an 
epidemiologist, for instance, to determine what 
importance to attach to the missing data and how to 
resolve the issue of missing data. 

A. Input from people who know about the 
context of the dataset is certainly useful. But it 
[illegible] misguided because they may not 
[illegible] the consequences of the missing data for 
[illegible] analyses, and so there are certainly 
examples of people doing analyses where they 
thought they were doing something that was good and 
it turned out to be bad. 

Q. But it does require, as I understand your 
[illegible], then, input or judgment from people in 
the discipline in which the context of the data is 
being generated? 

MR. BIERSTEKER: Objection, 
mischaracterizes the testimony. 

MR. WITHEY: I'll withdraw the 
question in light of the objection. 

BY MR. WITHEY: 

Q. As I understand your description of 
multiple imputation, then, if you were to have 
missing data, instead of implying or imputating, 
I guess is a better word, that the missing data 
number is 80, as an example, just hypothetically, 
you might want to use multiple imputations to say 
what happens if we assume it’s 70 and 90 in 
addition to 80. Is that a gross lay description? 






A. That's a gross characterization saying 
that you should do the analysis when it's 70, when 
it's 80, when it's 90. And there's a way, if those 
70, 80 and 90 were chosen in the proper way, we can 
look at those three answers that result from 
putting 70, 80 and 90 in, we can combine them into 
one answer where the point estimate, the answer, 
will be better for a certain reason than it would 
be otherwise; and moreover the interval estimate, 
the standard error that we get, will be valid. 
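
The combining step described in the answer above can be sketched in code. This is a minimal illustration, assuming the standard combining rules for multiple imputation: average the point estimates, then add the average within-imputation variance to an inflated between-imputation variance. The testimony does not state these formulas, and the function name `pool_estimates` and every number below are hypothetical.

```python
import math
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Combine m completed-data results into one estimate and standard error.

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    """
    m = len(estimates)
    pooled = mean(estimates)            # combined point estimate
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical results of running the same analysis three times,
# once with each imputed value (for example 70, 80 and 90) filled in.
estimates = [10.2, 11.0, 11.8]
variances = [0.25, 0.25, 0.25]

point, se = pool_estimates(estimates, variances)
print(f"pooled estimate = {point:.2f}, standard error = {se:.3f}")
```

In this sketch the pooled standard error is larger than the 0.5 that any single filled-in dataset reports, which is the widening of the interval that reflects uncertainty about the missing value.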

Q. Does it take any judgment at all to 
determine whether in a given context, whether it be 
epidemiology, laboratory studies, economics, 
whether the imputed points should be 70 and 90 
versus 60 and 100? That takes some judgment. 
Correct? 

A. Judgment from the scientific input is 
always helpful. But because the imputation 
procedure is a predictive one, it is much more of a 
statistical problem rather than a scientific one. 
Not to disparage the scientific input. It's always 
useful. But some experience suggests that 
predictive models don’t require as much science as 
sort of the inferential models. 

Q. What impact does the use of multiple 
imputation have on the width of confidence 
intervals as opposed to a more precise best 
estimate? 

A. It means instead of just imputing some 
value and stuffing it in there and believing 
it's the truth, what multiple imputation 
[illegible] always do is make the confidence 
intervals wider. Will increase the uncertainty, 
reflecting the fact that you don't know what the 
value is. 

Q. So the decision whether or not to use 
multiple imputation then depends upon how confident 
you are in the best estimate. Fair enough? 

A. Well, if you are really confident in the 
best estimate so there’s no uncertainty, then it 
wasn't missing, was it? So why did you call it 
missing? 

Q. You can have a number that is an exact 
number and you can also have an estimate that is a 
reasonable estimate in the eyes of the researchers 
in a particular field, whether it be economics, 
epidemiology, that feel reasonably confident that 
that is a number or even express that number within 
a range that then justifies, okay, we don't need to 
use multiple imputation. 

A. To clarify -- 

Q. Fair enough? 

A. Not really. Expressing it within a range 
is the same idea as multiple imputation. 
And secondly, if you put in a best value and 
you don't know that that's the right answer, you're 
going to get the wrong inferential answer. 

Q. So it is better to put it in within a 
range, I mean, your judgment then from a 
statistical standpoint is it is better to express 
values, including for missing data, within a range 
rather than just here's our number and nothing on 
either side of it. Fair enough? 

A. Correct. And it has to be done for all 
missing values at the same time. You can't do it 
one at a time and it has to be done in a way that 
they're tied together. So we were describing it 
just as if there was one missing datum, and that's 
a very simple case, but as you have more and more 
missing values, it becomes more essential to do it 
really correctly. 

Q. Have you done up a chart or are you 
prepared to do any chart or diagram that you've 
relied upon to describe what is the missing data in 
this case? 

A. Well, there has been a chart that was 
done in Minnesota that had NMES as part of it and 
the missing-data problem of NMES as part of it, so 
it [illegible] NMES. In this case with respect to 
the datasets that the plaintiffs are using, which 
are CPS-II, NHIS, and the expenditure data, I have 
actually thought about doing such a chart, for 
example, for the expenditure data and the missing 
data in the other two datasets. 

Q. Have you drafted anything at all? 

A. No, I haven't drafted anything. 

Q. Do you have any opinions on the issue of 
the question of missing data as it relates to 
CPS-II that you have expressed in any report? 

A. Expressed in any report on CPS-II? 

Q. Yes. 

A. Have I looked at the missing data in 
CPS-II? I'm not sure I have. I mean, one of the 
major problems with CPS-II is it's not a survey in 
any sense; it is just a collection of volunteers, 
of people. And so in some sense it can be regarded 
as a missing-data problem because there are 
all these people that you would have had if you 
[illegible] do a representative survey and they're all 
missing, so in some formal sense it is one of the 
[illegible]irst missing-data problems. NHIS has -- 

Q. I just asked you about CPS-II. 

A. Oh, I'm sorry. 

Q. I am going to move to strike the last 
part of that answer as nonresponsive. I just asked 
you if you had expressed any opinions in any 
report about quote-unquote missing data within 
CPS-II. 

A. Had I expressed? Okay. 

Q. And your answer was? 

A. I think I referred generally to missing- 
data problems being an issue in all these datasets 
in all these analyses. 

Q. Show me where in your reports in this 
case you refer to the missing-data problem in 
CPS-II. 

A. Let's see. Page 4 of the first report, 
item 7, where it is stated "The statistical 
techniques employed in both models and in the 
studies upon which these models rely must be 
reliable and statistically valid. For instance, 
missing data must be addressed in an appropriate 
manner," and then I go on. 

Q. No, I'm asking do you have an opinion 
that the missing data in CPS-II -- Do you state 
anywhere in this report that the CPS-II contains 
missing data that is subject to multiple 
imputation? 

A. No, I do not state that anywhere in the 
report, but I have a general concern that is stated in 
this [illegible] that the analysis of any dataset that 
has missing data must be handled in some valid way. 

Q. What is the missing data in CPS-II? 

A. Well, as I said before, in some formal 
sense it is a gigantic problem because it is not a 
representative survey. It wasn't drawn that way; 
it's a volunteer survey. So you can think of the 
population of the United States as being the full 
dataset that you wish you had and when you take a 
representative survey, you sample items -- people, 
not items -- individuals or families from this 
giant dataset, and you know exactly the way in 
which you created missing data in the sense of 
people for whom you have no data. You know that. 
You did it by a probability survey, clustered, 
[illegible], whatever you did, and you had weights 
and you know how to get back to the population. 
You know how to get rid of that missing-data 
problem that you created by doing the survey. 
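
The idea of using weights to "get back to the population" can be illustrated with a small sketch. It assumes each sampled unit's selection probability is known, as in a probability survey, and weights each unit by the inverse of that probability. The function names and figures are hypothetical and do not come from the testimony or from any of the surveys discussed.

```python
def weighted_total(values, inclusion_probs):
    """Estimate a population total: each sampled unit stands for 1/p units."""
    return sum(y / p for y, p in zip(values, inclusion_probs))


def weighted_mean(values, inclusion_probs):
    """Estimate a population mean using inverse-probability weights."""
    weights = [1 / p for p in inclusion_probs]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Hypothetical sample: two people drawn at a 1-in-100 rate and one
# drawn at a 1-in-10 rate, with some measured outcome for each.
values = [100.0, 120.0, 300.0]
inclusion_probs = [0.01, 0.01, 0.10]

print(weighted_total(values, inclusion_probs))
print(weighted_mean(values, inclusion_probs))
```

No comparable calculation is available for a collection of volunteers, because the selection probabilities that the weights depend on are unknown.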


ACS-II is not a survey. We have no 
way of knowing how to get back to the population of 
the states because it's just they're 
volunteers and they were asked to contact their 
friends. And so we have this very large dataset 
that is not a survey, it's a collection of data, 
and formally it is a subset with all this missing 
data, all these millions and millions of people who 
aren't in ACS-II, and how do we get back to the 
population from this dataset. It's a horrible 
missing-data problem. 

Q. Does it take an epidemiologist to judge 
and to determine whether the use of data from 
1.2 million people contained within the ACS or 
CPS-II study is appropriate to apply to the 
250 million other people in the United States? 

A. Does it take an epidemiologist to 
determine that? 

Q. Yes. 

A. Does it take an epidemiologist? The 
implication is I know if I say yes, it means you 
need an epidemiologist; if I say no, then it 
doesn't take anybody to do it. So I don't know how 
to answer that question. Can you reformulate it? 

Q. Is the missing data in CPS-II the 250 
million other people that weren't surveyed or whose 
medical information was not obtained? Is that what 
you're referring to? 

A. In a formal sense that's correct, because 
it is not a survey. We don't know how to get back 
to the population. 


Q. Well, they asked questions of people. 
Right? In ACS-II. 

A. Yes, almost all white. 

Q. I didn’t ask you what they were. I said 
did they ask questions of them. I didn't ask 
whether they're white or not. Did they ask 
questions of them? 

A. Yes, they asked questions. 

Q. Thank you, thank you. But it wasn't a 
survey? 

A. It wasn't a probability survey. As a 
survey, in the technical statistical meaning of a 
survey, it is not a survey. It is a collection of 
people who volunteered. 

Q. Does CPS-II present its results on the 
relative risk of smoking and specific diseases 
within a range of 95 percent confidence interval? 

A. Well, they attempt to. But those 
intervals really don't have any formal 
meaning. 

Q. It doesn't have any formal meaning even 
as to the 1.2 million people surveyed or 
questioned? 

A. With respect to the 1.2 million people 
the confidence interval is zero. You have the 
answer for those people. There is no confidence 
interval. The confidence interval has to do with 
representing a population. 

Q. And does the Surgeon General report the 
results of CPS-II in Chapter 2 and 3 of the Surgeon 
General's report of 1989? 


A. I don't know whether it's those chapters. 
I would have to look at it to know if those are the 
chapters where it's reported. I apologize. 
I don't know the chapters. 

Q. But the Surgeon General's report does 
report the results of the CPS-II data? 

A. Certain analyses of the CPS-II data, yes. 

Q. And does it report the results within a 
confidence interval? 

A. They produce confidence intervals which 
are formally invalid. 

Q. Is there anything valid about the CPS-II 
study from your standpoint? 

MR. BIERSTEKER: Objection, vague. 

Q. Anything at all. 

A. With the analyses in the Surgeon 
General's report of the CPS-II data, it gives 
indications of the public health importance of what 
the risks of smoking might be. 

Q. Might be? 

A. Might be. 

Q. Meaning it's only possible, then, not 
probable? 

A. The issue has to do with causality again. 
There are associations that are important public 
health concerns. 

Q. Is there any data that is presented in 
the Surgeon General's report describing or 
collecting the data from CPS-II that you consider 
valid and reliable data? 

MR. BIERSTEKER: May I have the 
question read back? I apologize. 

MR. WITHEY: I'll repeat it. 

BY MR. WITHEY: 

Q. Is there any data presented in the 
Surgeon General's report in discussing CPS-II that 
you consider valid or reliable? 

MR. BIERSTEKER: Objection, vague. 

A. I'm not sure what you mean. I thought 
I just answered that question. 

Q. Answer it again then. I'm sorry if you 
answered it. 

A. The part that I was saying was invalid is 
if it's thought of as representing the U.S. 
population, it's not valid for doing that. The 
confidence intervals are not formally valid 
confidence intervals because there's no probability 
structure underlying the way the data were 
collected. It does have validity, certainly, for 
making public health statements about what the 
risks of smoking for certain kinds of diseases 
might well be. 


Q. Does it have validity as it relates to 
the judgment of causation or inferring causation 
from such studies as the Surgeon General's report 
does? 

MR. BIERSTEKER: Objection on several 
levels. It is vague. I'm not sure it accurately 
states what the Surgeon General's report says. If 
you wish to show it to us, that's fine. Go ahead 
and answer, if you can, Doctor. 

A. The question again was what? 

Q. Does the data reported on CPS-II have 
validity and reliability as it relates to the 
conclusions reached by the Surgeon General that 
certain diseases are caused by smoking, if you have 
expertise on that? 

MR. BIERSTEKER: That's a different 
question, so let me have that one back. I'm sorry. 

BY MR. WITHEY: 

Q. Does the Surgeon General report's use of 
the CPS-II data and reporting of the CPS-II data 
have validity and reliability when it comes to the 
conclusions reached by the Surgeon General that 
smoking causes certain diseases? 

MR. BIERSTEKER: Objection, vague and 
compound. 


A. Causes certain diseases? Depending upon 
the use of "cause" here and the way we were talking 
about it before, meaning examples, only under other 
assumptions does it really address causality. 

Q. Does the Surgeon General's 1989 report 
use multiple imputation? 

A. No, it does not. 

Q. Are there any critiques of multiple 
imputation in the statistical literature? 

A. [illegible] 

Q. Can you describe who has written those 
critiques? 

A. Well, I'm not sure about all of them. 

Q. How many of them are there? 

A. Oh, maybe half a dozen maybe, something 
like that. 


Q. Can you describe the authors? 

A. Well, one is Bob Fay, Census Bureau. 
Q. F-a-y? 

A. Yes. I mean, very often these criticisms 
are sort of throwaway lines and they say -- 

Q. I'm just asking, Doctor, the names of the 
people who have written any criticisms of MI. I'm 
not asking you to describe what those criticisms 
are; I'm not asking you to comment on whether you 
consider them valid. I'm just asking for the names 
[illegible] asking for. 

A. I still want to answer the question, if 
I may, which is that depending upon what you mean 
by criticism, some of these comments may or may 
not be criticisms. 

Q. Those that you consider to be raising 
questions or controversy about the use of MI. 

A. Or controversy, good. Some of the people 
[illegible] that controversy say it's too difficult to do 
because of the computing environment and they go on 
to talk about their own method. 

Q. Name them. 

A. I believe John Rao could be categorized 
that way. 

Q. R-a-u? 

A. R-a-o. To some extent Bob Fay could be. 
The only general criticisms of it had to do with an 
issue that could be called superefficiency, that it 
can tend in extreme examples to overcover, to be 
too conservative. 

Q. And the name of the person who said that? 
A. Fay. 


Q. I am not asking you right now for who 
said what. I'm just asking you for the names of 
the people you would put in the category of people 
who raise questions, who believe MI is 
controversial or is subject to criticism. Just the 
names, we can start with that. 

A. Well, who else has? 

Q. How about John Elting? 

A. John Elting was one of the discussants of 
that [illegible] in JASA and the last time I spoke to 
John about it I think he was kind of a fan. So if 
you're referring to something I haven't read, it's 
possible that he was in earlier years more 
critical. 

Q. He refers to the multiple imputation 
controversy. Is that correct? John Elting. 

A. He may. That may be the way he refers to 
it in this package of JASA articles in 1996. 
Anything that's new generates controversy, so it 
was new and some people -- 

Q. All right. It's still controversial. 
Fair enough? 

A. Well, my impression is every year less 
so. 

Q. But it is still controversial. Fair 
enough? 

A. So is linear regression. 

Q. Is it still controversial? 

A. As all analyses are, yes, to some extent. 

Q. All analyses are controversial? 

A. Any analysis, depending upon how it is 
[illegible] certain cases, there's some controversy 
[illegible] whether that was the proper analysis to do. 

Q. But I thought you had indicated that 
because it was new it was controversial. 

A. That made it especially so, yes. 

Q. "It" being multiple imputation. Correct? 
A. Right. 

Q. Any other people other than Fay, Rao and 
Elting that have raised concerns or questions about 
the use of multiple imputation? 

A. I think that mischaracterizes John Elting 
a little bit, but.... I believe in one of the 
articles on NMES Sommers makes a comment that it's 
too hard to do and refers to Rao for the reference, 
as a reference to that. In that same package of 
JASA there's a guy, I can't remember his name now, 
who was kind of a fence-sitter on it but comes down 
on the side of it's probably the only game in town 
that validly addresses the general problem. And 
I can see his face but I'm blocking on his name. 
It'll come to me. 

Q. Well, if you think of it would you tell 
me? 

A. Sure. The easy way to do it is to get 
that package of JASA articles in 1996 and he's one 
of the discussants. 

Q. Would you agree that the fact that the 
epidemiological studies cited in the Surgeon 
General's '89 report do not use multiple imputation 
does not necessarily for that reason cause those 
studies to be unreliable? 

A. Yes, I would agree with that. Just for 
that reason alone, no. 

Q. Does NMES use multiple imputation? 

I guess you've already answered that. But the 
answer is no. Correct? 

A. As far as I know it does not. 

MR. WITHEY: Why don't we take a 
five-minute break and then hopefully we'll be done 
in about forty-five minutes. 

(In recess 3:45 p.m. to 3:51 p.m.) 

BY MR. WITHEY: 

Q. Have you ever written any papers or 
[illegible] any papers on extrasensory perception? 

A. On extrasensory perception, no. On a 
[illegible] analysis that can be used in a variety of 
[illegible] including that, there was a journal that 
[illegible] in along with Bob Rosenthal, who is a 
[illegible]st at Harvard; it was sort of tests for 
[illegible]-sample stuff that could be used for ESP. 
And I think the editor of that journal was a friend 
of Bob Rosenthal's and he invited us to write a 
paper and Bob wanted to go there. 

Q. What is the name of the paper? 

A. I can look in my C.V. and find it. 

Q. Would you do that for me, please? 

A. Do you have the date on it? That would 
help a lot. 

Q. I don’t have a date. (Pause) Let's save 
some time. If you don't mind -- Go ahead. 

A. I must be honing in on it; it can't be 
far away from where I am. I'm surprised it's this 
far up, but.... 

Q. If you can find it later and provide a 
copy to Mr. Biersteker, we can have it marked as 
Exhibit [illegible]. 

A. Sure. I'm just shocked. The years fly 
by, I guess. 

Q. That's okay. 

A. I apologize. I really thought I would 
be able to get right to it. 

Q. It should be in your C.V., though? 

A. Oh, yeah, oh, yeah. It's not 
neuropharmacology. 

Q. Now, what is propensity scoring? 

A. The idea of propensity scoring is the 
following. You have two groups, for example, 
smokers and non-smokers, and you want to compare 
them on an outcome such as medical expenditures 
after having controlled for a collection of 
background variability, covariates, for example, a 
collection of background variables. And the idea 
of propensity scoring is to estimate -- I just 
found it. 

May I interrupt this question to give 
you the answer to the other question? I apologize. 

It's number 182 on my vitae and it is 
called An Effect Size Estimator For 
Parapsychological Research. It's in the European 
Journal of Parapsychology, I guess. And what the 
topic was is if you have -- It also applies to 
testing contexts where there is a multiple choice 
test and what is the probability you're going to 
get the question right just by guessing. And in 
parapsychology they often use a different number of 
choices, like maybe five choices you have to guess 
from, ten choices you have to guess from. So 
some studies are five choices, some studies are ten 
choices. 

An exam like the SAT exam uses five 
[illegible] typically and if you want to know whether 
people are doing better than randomly choosing, and 
different studies use different numbers of options, 
this is a method for sort of calibrating them all 
down to the same number of options. So it can be 
used in the testing context as well as sort of this 
parapsychology stuff. 
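
The calibration described above can be illustrated with a short sketch. The testimony does not give a formula, so the rescaling below is an assumption: it converts a hit rate observed with k options into the hit rate that would correspond to a two-option task, so that pure guessing always maps to 0.5 and a perfect score to 1. The function name `two_choice_equivalent` and the example figures are hypothetical.

```python
def two_choice_equivalent(hit_rate, num_choices):
    """Rescale a proportion correct on a k-choice task to a two-choice scale.

    Assumed form: P * (k - 1) / (1 + P * (k - 2)).
    With k = 2 the hit rate is returned unchanged.
    """
    k = num_choices
    return hit_rate * (k - 1) / (1 + hit_rate * (k - 2))


# Guessing on a five-option and a ten-option task lands on the same value.
print(two_choice_equivalent(1 / 5, 5))
print(two_choice_equivalent(1 / 10, 10))

# Hypothetical studies with different numbers of options, put on one scale.
print(two_choice_equivalent(0.40, 5))
print(two_choice_equivalent(0.25, 10))
```

Once both studies are on the same scale, a 40 percent hit rate with five options and a 25 percent hit rate with ten options can be compared directly.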

MR. BIERSTEKER: Mike, since that 
article is in the peer-reviewed literature, do we 
have to get a copy and make it part of the record 
or not? 

MR. WITHEY: If you don't mind, yes. 
I'm not sure we can get our hands on the European 
Journal of Parapsychology, to tell you the truth. 
That would be great. Thank you, Peter. 

BY MR. WITHEY: 

Q. Going back, you were talking about 
propensity scoring. 

A. Right. So it's a method, first of all, 
for figuring out how far apart these two groups, 
say, smokers and non-smokers, are with respect to 
this collection, the whole collection of background 
variables and not just one of them, to see how much 
overlap there is in the multivariate distributions. 
That's the first thing it does. The second thing 
it does, it's a technique that can be used in 
combination with models to adjust for the 
differences in distributions of these background 
characteristics between the two groups. 

So, for example, in the smoker/ 
non-smoker example let's suppose we have background 
variables like age, sex, race, married/not married, 
seat belt use/nonseat belt use, et cetera, a full 
collection of background variables and you say, 
gee, I would like to compare like with like. 
I would like to compare smokers with like 
non-smokers to try to make some assessment of 
health-care expenditure differences that are due to 
smoking, not confounded by all these background 
variables. 

Q. What is the significance if you find it 
as to a particular variable that was selected that 
there is no overlap between the two groups, in this 
case smokers and -- 

A. You stopped me in the middle of the 
answer but I'll go on and answer this new question 
if you want me to. 

Q. I'm sorry. I thought you were done. 
Continue with your answer, then. If you can answer 
the other question, fine. 

A. It flows from what I said earlier so it's 
a perfectly fine question in the context. I didn't 
mean to say that for that reason. 

If there's no overlap in the 
distribution, then the suggestion is without heroic 
assumptions you really can't compare these two 
groups [illegible]. 

Here's an example. Let's suppose all 
the smokers you have were obtained from some 
dataset and they're age 19 plus and then the 
control group was gotten from the infant ward in 
the hospital and their ages are from one week to 
seven months old. Now, you say, well, we'll just 
do an adjustment for this using a regression 
model, and the regression model will give you an 
answer. It will give you an answer possibly with 
[illegible] stars attached to it saying highly highly 
[illegible]. 

Propensity score analysis in this 
trivial case would say, look, these two groups 
don't overlap at all on age and the only way you 
can make a comparison that says something is due to 
smoking, having adjusted for age, is under some 
heroic assumptions that involve model-based 
extrapolations from one region of the data to the 
other. Which doesn't make a lot of sense. 

Q. Were you finished your answer? Actually, 
you've answered both questions, have you? 

A. I haven't answered the other question 
about how you would use it to help estimate this 
effect assuming there is some overlap. 

Q. Okay, go ahead and answer that. 

A. There are several ways in which it can be 
helpful to estimate when there is some overlap. 

To continue this kind of trivial 
example, let's suppose the smoking group was as 
before, age 19 and over, and the non-smoking 
group had all these infants in it but also had some 
people who were over 19 years old. In a 
multivariate sense, although this example is a 
single-variable sense, but the propensity 
score is an analogous way of reducing a 
multivariate space down to one variable called 
propensity scoring. In the age example you would 
throw away all the control people who are 
younger than 19 years old because they're kind of 
irrelevant, they're infants, and then do an 
analysis comparing the over-19 smokers with the 
over-19 non-smokers. 
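
The age example can be sketched in a few lines. This is only an illustration of the overlap check on a single variable, using a simple minimum-to-maximum range as the region where both groups have data; in a propensity score analysis the same check would be made on the estimated score rather than on age. The function names and ages are hypothetical.

```python
def common_support(group_a, group_b):
    """Return the (low, high) range where both groups have values, or None."""
    low = max(min(group_a), min(group_b))
    high = min(max(group_a), max(group_b))
    return (low, high) if low <= high else None


def trim_to_support(values, support):
    """Keep only the values that fall inside the shared range."""
    low, high = support
    return [v for v in values if low <= v <= high]


smoker_ages = [19, 25, 40, 63]
infant_ages = [0.02, 0.3, 0.6]                # ages in years, all under one
mixed_control_ages = [0.02, 0.3, 22, 35, 60]  # infants plus some adults

# No overlap: any comparison would rest on extrapolation by a model.
print(common_support(smoker_ages, infant_ages))

# Some overlap: drop the controls outside the shared range, then compare.
support = common_support(smoker_ages, mixed_control_ages)
print(support)
print(trim_to_support(mixed_control_ages, support))
```

In the second case the infants are discarded and the comparison proceeds between adult smokers and adult non-smokers only.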

So propensity analysis scoring takes 
a large set of background variables, twenty, 
thirty, forty, and reduces it down to one so you 
can do this exact same thing using only one 
variable rather than the four-multivariate set. 
Moreover, it can be used in analyses either to 
match people up and do a match analysis or a 
subclassification analysis where you form 
subclasses based on people who have similar values 
of a propensity score, and it can be used in 
combination with model-based adjustments to provide 
more robust estimates than can be obtained 
from models alone. 

Q. Does it require input from other 
disciplines, in this case, if you're talking about 
smokers and non-smokers, the medical community or 
epidemiologists, as to what might be legitimate 
covariates to analyze using propensity scoring? 

A. It certainly is helpful not only for 
choosing the covariates but also for consideration 
of what interactions or higher-order terms would 
be useful in such a propensity score model. 

Q. Let me ask this. As I understood your 
testimony, then, if there is no overlap between 
smokers and non-smokers and some other covariate, 
no matter what that covariate is, then the 
suggestion is you cannot really compare the two 
groups? 

A. Not quite. Without some explicitly 
stated assumptions. For example, if they don't 
overlap at all on age and you say assuming age is 
irrelevant and they overlap on everything else, 
then you can go forward, but you have to state I'm 
assuming age is irrelevant. 

Q. Irrelevant to the issue of whether there 
are differences in disease distributions between 
smokers and non-smokers, then, in that example? 

A. Whether age affects in this example, say, 
medical expenditures. 

Q. Or disease distribution. Correct? 

A. Right. 

Q. That could be a subject. Right? 

A. Sure. 

Q. So you would want to have something that 
at least in the judgment of the doctors or 
epidemiologists would hypothetically cause a 
disease or be a risk factor for the disease as 
opposed to something that is like totally 
irrelevant and has nothing to do with causing 
disease or being associated with disease? 

A. Absolutely. Although sometimes you have 
to be a little bit careful about that. For 
example, a covariate like length of hair seems 
a priori to probably be irrelevant to disease, yet 
if somebody forgot to record gender it could be a 
proxy for gender. So you have to be somewhat 
careful sometimes about saying it's not related to 
it. But basically you have the right idea. You 
only want to include things that are arguably risk 
factors or potential confounders and their 
interactions. 


Q. Let me ask you, as an example, let's say
you're comparing diseases again or expenditures
between smokers and non-smokers and you had as a
covariate people who use lighters.

A. Okay, so the covariate is an indicator
that you use a lighter or you don't use a
lighter.

Q. Right. All of the smokers use a lighter
and none of the non-smokers use a lighter. There's
no overlap.

A. Correct. 

Q. Then in that circumstance would there be 
a reliable and legitimate comparison between 
smokers’ and non-smokers’ health care expenditures? 

A. If that’s the only reason why they don’t 
overlap, sure. Because you've just stated 
explicitly the assumption that I don't believe
using a lighter or not is a risk factor for this
disease. In fact it could be, if the gases that
come out of the lighter are the real cause of the
disease and they have nothing to do with cigarette
smoking. But that's an assumption that would be
stated, and it could be very plausible, very
reasonable to proceed under that assumption.

Q. Have you identified what covariate risk
factors would be appropriate to use in comparing
disease distribution between smokers and
non-smokers?

A. What we have done is look at the
collections of variables that have been used or
proposed to be used in the plaintiffs' expert
reports and testimony in the various cases. We
have sort of taken the attitude in the case that if
plaintiffs say these are variables that we have to
adjust for, then it is quite arguable that we
should be adjusting for them.

Q. And do you have a list of those? Have 

you yourself kind of formed a list of those? 

A. Well, let me see. In this report, which 
you have the disks for, and the supplemental 
report, there are propensity score analyses where
we explicitly say which covariate sets we use.
I think in at least some of them we use a fairly
complete list that is from a Harrison report in
Oklahoma. But there are other variables beyond
that that some of plaintiffs' experts say should be
adjusted for, I believe, that are maybe not in that
list.


Q. Have you read the Batelle report related
to confounding variables in the CPS-II data?

A. I've read parts of it several times and
[illegible] through most of it. This is by
Schuma--, is that right, Shulman? I'm sorry, I'm
not getting the name right. Do you have the name?

Q. I don't. I could get it, but....

A. I just want to make sure we're talking
about the same report, that's all. 1997 maybe?


Q. Yes. 

A. Shulman et al., I think. 

Q. In your reports in this case have you 
made any criticism of the Batelle analysis? 

A. I haven't made any comment on it in the 
reports. 

Q. Have you indicated that as one of your 
reliance documents?

A. I don't know if I have but I think the 
plaintiffs may have. 

Q. I'm just asking about you. 

A. I don't think I have but I'm not quite
sure.

Q. It is not one of the documents you cited
as relied upon in forming any opinions?

A. I don't believe so.

Q. And you haven't expressed any criticisms
in any of your reports?

A. Any of the written reports?

Q. Yes.

A. Yes.

Q. Well, if it's not in the written reports,
you know, we would've expected it to be there.
Let me ask you this. Does the
Surgeon General's report use propensity scoring?
A. No, it does not. 

Q. Do you know of any study of smoking and 
human health that uses propensity scoring? 

A. Do I know of any — ? I don't believe 


Q. Does NMES use propensity scoring? 
A. It is not the sort of thing they would
use. I mean, NMES is a database, so a database
sort of doesn't use an analysis. Propensity
scoring is involved typically in an analysis to try
to reach a conclusion, a point estimate, a
confidence interval, and so forth. And NMES is a
dataset, so....

Q. Does the Agency for Health Care Policy
use propensity scoring?

A. I don't know. They may, either the
[illegible] or they may support work where people do.

Q. Did you testify in Minnesota at least a
year ago that the Agency for Health Care Policy to
your knowledge did not use propensity scoring?

A. I believe I did so. And the reason the
answer is a little bit different now is that the
department of health care policy at Harvard medical
school, which I believe does get funding from that
agency through some of the faculty, there are
several faculty and at least one or two theses that
are using propensity scoring. So it could well be
that the agency is supporting the use of it now,
which would not have been true, at least I wasn't
aware of it, perhaps a year ago or whenever the
Minnesota trial was, approximately a year ago.

Q. So you're talking about the Harvard 
department of health care policy? 

A. Yes, I am. 


Q. And that is an agency here at Harvard or
in Bo[illegible] at Harvard?

A. No, it is a department in the medical
school at Harvard.

Q. Is it a U.S. agency?

A. No. But I think -- Let me repeat what
I said in my answer --

Q. I heard what you said. I'm just trying
to find out if it's an agency of the United States
Government.

A. No, Harvard University is not an agency
of the U.S. Government to my knowledge.

Q. That's all I asked.

A. But --

Q. Is the Agency for Health Care Policy a
U.S. agency?

A. Yes.

Q. Do you know of any U.S. Government health
agency that uses propensity scoring?

MR. BIERSTEKER: Objection to form,
vague.


A. Any U.S. agency?

Q. Any U.S. Government health agency. In
other words, the Department of Human Resources, the
National Institute of -- what's it called?

A. National Institutes of Health?

Q. National Institutes of Health.

A. I believe there are people at NIH that
are using it. I know people at GAO have used it,
Government Accounting Office, in evaluating cancer.
I believe people at NCI, National Cancer Institute,
[illegible] not positive about those. I am positive
about [illegible] in GAO.

Q. How about the Census Bureau, does that
use it?

A. Don't know why they would, for collecting
the kind of data they do. Well, yes, they have
done some things related to it, but it is not the
same kind of application that we're talking about
here, so even if they have it's not really germane.


Q. Maybe you could name three medical 


journals that you consider to be authoritative on 


general-issue topics of medicine and science. 


A. I am not a great one but I'll tell you 
what I know, mostly as a layperson: New England
Journal of Medicine.

Q. That's a good one.

A. Lancet; Journal of the American Medical
Association --

Q. Fair enough. Lancet is a British
journal. Correct?

A. -- Annals of Internal Medicine.

Q. No, that's all right. Three's fine.

Now, are you aware of any articles
published in any of those three journals that
utilized propensity scoring at any time in the last
thirty years?

A. Yes.

Q. Can you name one?

A. One of which in JAMA I was a co-author --


Q. No, I'm sorry. I should have asked it 
other than yourself. Other than articles you've 
published or participated in publishing, any other 
articles that use propensity scoring. 

A. I believe so. 

Q . Name any. 

A. I think there was an article by Connors a 
couple years ago in JAMA. I think there are
several articles that in recent years have used it
I believe in JAMA. I don't know about the New
England journal. And I certainly don't know about
Lancet.

Q. What was the article about that Connors
wrote?

A. And, again, I'm not positive that it was
published in JAMA. I know the article was by
Connors and I'm almost positive he used propensity
scoring. It was on, oh, I can't.... I believe it
had something to do with maybe catheterization
after coronary bypass surgery and trying to adjust
for differences between people who had been cath'd
and p[illegible] who had not been cath'd, and they did a
propensity scoring thing including a sensitivity
analysis outlined in a paper I did with somebody in
1985.

Wait. Sharon Norman, who is at the
department of health care policy in Harvard, we
were just at a session together and she was giving
a talk on some analyses that she had done also
using propensity scoring in a health care database.
I don't know whether it's been published yet or
not.


Q. I'm just talking about articles published
in the New England Journal of Medicine, Lancet, or
JAMA that were published and presented data or
analysis that utilized propensity scoring.

A. If I could have a copy of the article
that's somewhere in the Annals of Internal Medicine
article, there's a long list of articles that
appeared in journals up to about three years ago
and there were some which I don't remember that
[illegible] appeared in JAMA beyond the Connors one
which I think appeared in JAMA that appeared in
other medical journals, and certainly that could be
supplemented. I just haven't taken the time to
keep on top of all the new publications that use
it. If you'd like me to try to collect
together publications in medical journals that use
propensity scoring, I could try to do so if that
would be helpful.

Q. Let's just take those three. Where does 
Connors teach or practice? 

A. I don’t know. 

Q. Do you know his first name? 
A. No, I don't. I know there was a follow-up
article that came from Seattle, Washington --
Seattle, that I believe was published in -- I don't
know -- maybe JAMA, maybe the New England journal.
There was a discussion, I thought, of his article.
But that may be wrong. I should retract that. I'm
not sure. I know there was an article that was
published that had discussion of it, but I don't
know whether it's published or where. I should
stop [illegible].

Q. Do you know who in Seattle did it?

A. I can't recall right now. Again, I have
that easily accessible, but I just can't recall.

Q. If you could find any articles published
in the three journals you've listed that utilize
propensity scoring, I would appreciate if you could
provide them to Mr. Biersteker, and it will be
marked as Exhibit 7, our next exhibit.

Have there been any articles critical 
of propensity scoring that have appeared in any of 
the peer-reviewed literature? 

A. There must be. Do I recall any of them 
specifically is what I'm struggling with. 

I don't -- I can't recall any specifically, but 
I'm sure there must be.

Q. Why do you say that? 

A. Well, it's an idea that is fairly new. 
Paul Rosenbaum and I wrote about it in an article 
that was published in 1983. And at first when you
describe the idea to people, that you can take a
collection of fifty variables and reduce it down to
one variable and get all the bias-reducing action
out of that one variable, it sounds implausible.
It sounds almost magical. After describing it, it
isn't anymore, but.... So an idea like that has
to generate some discussion, I guess.
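
The reduction of many covariates to a single score, as described in this answer, can be sketched as follows. This is an illustration under stated assumptions and not the analysis in the witness's reports: the data are synthetic, the assignment mechanism is invented, and the score is estimated with a plain logistic regression fitted by gradient ascent, one common way to estimate a propensity score. The function names `fit_propensity` and `propensity_scores` are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_propensity(X, t, steps=200, lr=0.5):
    """Fit a logistic regression of group membership t on covariates X by gradient ascent."""
    n, k = len(X), len(X[0])
    w = [0.0] * k
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * k
        grad_b = 0.0
        for xi, ti in zip(X, t):
            err = ti - sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        b += lr * grad_b / n
        w = [wj + lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

def propensity_scores(X, w, b):
    """Collapse each unit's covariates into one number: its estimated propensity score."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

# Synthetic data (assumption): 300 units, fifty covariates each.
rng = random.Random(0)
n_units, n_covariates = 300, 50
X = [[rng.gauss(0.0, 1.0) for _ in range(n_covariates)] for _ in range(n_units)]
# Invented assignment mechanism: only the first two covariates affect group membership.
t = [1 if rng.random() < sigmoid(0.8 * x[0] - 0.5 * x[1]) else 0 for x in X]

w, b = fit_propensity(X, t)
scores = propensity_scores(X, w, b)  # one variable per unit instead of fifty

treated = [s for s, ti in zip(scores, t) if ti == 1]
control = [s for s, ti in zip(scores, t) if ti == 0]
mean_treated = sum(treated) / len(treated)
mean_control = sum(control) / len(control)
```

Comparing or matching the two groups on `scores` then stands in for comparing them on all fifty covariates at once, which is the property the answer calls almost magical.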

Q. Including concern, criticism, et cetera?

A. Well, yeah. I mean, there must be
discussion somewhere about ways of misusing it and
how it can be misapplied. In fact, the question
that you were asking before was a good question,
about the lighter. If you put in an irrelevant
variable, the whole thing kind of falls apart. So
you don't want to put in irrelevant variables. So
is that a criticism of the method? Well, I don't
know. It's a criticism of how it can be misused or
something like that. But it's a good comment. You
shouldn't put in garbage variables because they'll
just make it not work right.

Q. At any rate, you can't think of any
articles?

A. That are explicitly critical of it?

Q. Right; or that would say, well, it is
subject to misuse, it is subject to --

A. I've written that.

Q. You have?

A. Yes, that you have to be careful about
how it's used. And if you put in irrelevant
variables, it doesn't help, it hurts.

Q. It is fair to say it is somewhat
controversial still as a new idea. Fair enough?

MR. BIERSTEKER: Object to form.

A. I don't think it is really
controversial. I mean, at least if you're
comparing that with sort of the multiple imputation
controversy, I think it is less controversial than
multiple imputation.

Q. It is not a method that has been utilized 
by an extensive number of researchers. Fair 
enough? 

A. I don't think that's true anymore. That 
was probably truer -- I mean, I'm just thinking. 
in the last year I've been invited -- I don't
know -- three, four, five times to give
presentations. Just last week I was invited by
Merck, I guess outside Philadelphia, where the
whole day was devoted to using propensity scoring;
about [illegible] talks.

Q. Who sponsored that?

A. I think it was Merck, you know, the
pharmaceutical company. And it was all devoted to
propensity scoring, observational studies and the
[illegible] propensity scoring methods. I think there
were four or five talks on real applications using
it.

Q. Were you paid for that?

A. I was given an honorarium and travel
expenses.

Q. How much was the honorarium?

A. Haven't been paid yet. But how much was
promised?

Q. Yes.

A. $1500.

Q. For how long?

A. Well, I only had to be there for an hour
to give my talk, but I stayed for the whole
sessions.


Q. Who paid you or was supposed to pay you? 
A. Merck. 

Q. Have you been given an honorarium or paid
by any other company in a non-academic sense --
I mean a corporation, an industry -- for
presenting your opinions or work or research
related to either propensity scoring or multiple
imputation?

A. Yes.

Q. Could you name those?

A. Not all of them.

Q. Name what you can and how much you were
paid.

A. The most recent one had to do with a
company whose name I'm blocking on right now, but
it involved an FDA submission and they wanted to
put together a committee of experts to help them
with their decision on what to do next.

Q. What kind of product?

A. It was a drug.

Q. What kind of drug?

A. I don't remember exactly and if I could,
I probably.... It was a pharmaceutical that went
through Phase 3 submission for FDA. I don't think
I should tell what it was anyway, if I could
remember exactly what it was.

Q. How much were you paid?

A. $1200 an hour.

Q. How many hours did you work on that
project?

A. From conference calls and a meeting
before, could have been fifteen, could have
been --

Q. 1500 or a thousand?

A. How many hours? Maybe 15 hours.

Q. I'm sorry; hours.

A. Maybe 15 hours. It may have been ten, it
may have been twenty. I don't know. I was reading
documents and two long conference calls.

Q. What other companies?

A. This is just generally advice, whether
it's been either propensity scoring or multiple
imputation, right?

Q. Yes.

A. Amgen.

Q. For what purpose and how much were you
paid?
A. There were two different things. One was
involving also an FDA type project from a couple
years ago and then more recently with Amgen there
was something involving a sort of pending
arbitration where I was giving some advice on
sampling. In that context issues of missing data
and differences in distributions arose, although
I don't know how specific that was to propensity
scoring and multiple imputation stuff, but it was
in the background at least.

Q. How much were you paid by Amgen?

A. At the same rate, 1200 an hour.

Q. How many hours?

A. That was maybe ten.

Q. For both?

A. No. I'm sorry. The ten was for the
recent one on the pending litigation or pending
arbitration.

Q. And the other one? 

A. The other one was a few years ago. 

Q. And how many hours did you work on that 
project a few years ago for the FDA type project? 

A. That was probably two years ago and that 
was a fixed-cost project where I actually got 
somebody to do some computing and some programming.
It involved multiple imputation of missing data in
pharmaceutical trials.

Q. How much were you paid?

A. Well, my memory of the flat rate for the
contract was $25,000. I think I paid -- I don't
remember how much I paid my colleague who did the
computing for that.

Q. Any other companies that you've consulted
with?

A. Do government agencies count as
companies?

Q. Sure.

A. National Center for Health Statistics,
totally involved in multiply imputing NHANES. That
was a project that continued over several years.
Census Bureau, we have had projects.

Q. How much have you made over several years 
for NHANES? 

A. Well, that project involved my partner 
and his corporation. Rod Little and me. I don't 
think anyone else was paid out of that. I don't 
know how much. Maybe it averaged out to $10,000 a 
year for four or five years. But, again, that 
was....

Q. Any other corporations or governmental 
agencies that you've been paid to consult with on 
MI or — 

A. Department of Transportation, National
Highway Traffic Safety Administration. The problem
there was purely to multiply impute FARS, the fatal
accident reporting system, where the critical
missing variable was blood alcohol content.

You're scowling as if --

Q. No, no. Did you work with Wecker at all?

A. No. That was several years ago. The
contact there at NHTSA was Terry Klein. That was
all purely multiple imputation, to multiply impute
this dataset. It's the dataset that is used
to make all the judgments about seat belt use and
airbags, and that's the dataset where you find out
airbags kill little babies sometimes when it goes
the wrong way, in a 10-mile-an-hour accident takes
off their head.

Q. Was that specifically on airbags? 

A. No, no. The whole problem was to
multiply impute blood alcohol content, the blood
alcohol content which is missing about half the
time in this giant database. And it is critical in
order to make some assessment of whether you should
lower the blood alcohol level from .1 percent to
.08 percent, which varies by state. Before we did
this they had no way of judging what gradations
there were because they only had three categories
of blood alcohol content in the big database.

Q. I just remember that. I mean,
I apologize. I just remember the airbag with the
[illegible] because that sets the time for me, at the
[illegible] issue was coming out.

Okay. How much were you paid for your
work with NHTSA?

A. I don't remember. I really don't.

Q. Any other corporations you've consulted
with that you were paid for this?

A. For government agencies?

Q. Yes.

A. I mentioned Census Bureau?

Q. Yes.

A. That's been going on for years.

Q. And which one was that?

A. Which one what? I'm sorry.

Q. Was it multiple imputation? 

A. Multiple imputation, yes. Multiple 
imputation in various of their surveys, in general 
to worry about undercount issues. Became a Supreme 
Court issue; just went up and came down. 

Department of Energy a few years ago,
I did some stuff for them. That was survey work.
I don't think there was anything on multiple
imputation there or propensity scoring.

Q. That's all I'm asking about.

A. That's right. I'm just trying to cover
[illegible] and see. There are others as well,
I believe. Well, GAO, Government Accounting
Office, that was a few years ago; it was on
propensity scoring.

Q. What was that context?

A. That was a context in which they were
concerned with NCI's recommendation on the
treatment of breast cancer. NCI's recommendation
was based on six very large randomized studies and 
the issue was whether those recommendations would 
transfer to the general population where the 
treatment and follow-up was not like it is in 
randomized experiments. 
Typically the women are different,
women who've volunteered. They have a coin toss to
determine what treatment they will take, whether it
will be mastectomy or breast conservation surgery,
in women and doctors who don't want to be
randomized. So there was a large observational
dataset that was used and we used propensity scoring to
adjust for background differences between women and
doctors who opted for breast conservation versus
those who opted for mastectomy, their background
[illegible] and region of the country, age, I think
marital status and some other things, I think
race. We adjusted for those differences to see
how the results compared, how the recommendations
that would come from those results compared with
those NCI was making based on their large
randomized experiments.

Q. What was the result of that work? 

A. It was kind of interesting. The overall 
results kind of confirmed in general the results 
from NCI, which were that for this group of women 
who fit these criteria, which is tumor smaller than 
a certain size, I think 2 centimeters or less and 
node negative, and some other criteria, that probably
there wasn't much difference in five-year survival
between mastectomy and breast conservation;
although there was a slight indication in the
observational dataset, slight, that maybe the women
and doctors were making wise choices. The overall
survival rate in the general population wasn't as
good as for the women and doctors in the randomized
experiments. Which was anticipated.

I've done some stuff I don't remember,
getting paid for. I've done some volunteer stuff.

Q. Let me see if there's anything else
I want to ask. (Pause)

A. I've done pieces of courses like lectures
participating in some courses for an organization
that is at Georgetown University called Centers for
Drug Development Science where we went off and gave
a course at Searle in Chicago and one at Alza in
California, which are both pharmaceutical
companies. And that involved some aspects of
multiple imputation.

Oh, the Post Office. Let's see, did
that involve -- ? No, we didn't end up doing
either of those things, so no.
MR. WITHEY: Let me review my notes.
I'm almost done here.

(Pause)

A. Done some stuff for UCLA.

Q. I said other than academia.

A. Oh, I'm sorry. Well, this was a hospital
that is affiliated with UCLA. I don't know if you
want me to throw that out or not.
You don't want to hear that? Okay.

Q. It's not what I'm asking.

A. Do short courses on multiple imputation
on which I was paid for help?

Q. That you were paid for?

A. Yes.

Q. For whom?

A. American Statistical Association. I did
one jointly with Joe Schafer at the summer
meetings. I've done, I think, two in the
Netherlands which I was paid for last year.

Q. As I read your supplemental report,
Exhibit 2, I see you've utilized in some cases the
32-covariate analysis selected by one Dr. Harrison
in the Oklahoma AG case. That's found on page 6.
You used the covariate analysis developed by the
Cambridge team in the Massachusetts Attorney
General litigation that is referred to on page 7.
And then you used the 43 variables in the Harrison
report in Oklahoma and the 40 covariates from the
Cambridge team's June 15 Massachusetts report,
referred to on page 10, for the propensity scores
analysis that you have given in this case.
Correct?

A. What do you mean? That are reported on
[illegible]?

Q. Yes. Or are reported on page 7 and 9.

A. Let's see. What we did is, the Cambridge
team didn't use NMES and Harrison didn't use NHIS.
So the most recent analysis we had from plaintiffs
for NMES, which had their variables, was Harrison's
analysis in NMES. So in some sense that was
considered the plaintiffs' most recent in some
general sense thoughts on what the collection of
covariates should be, so we just used it for NMES.

When you move to NHIS, the same
variables that are defined for NMES aren't defined
in NHIS, so we went to the most recent plaintiffs'
report that used NHIS, which was the Cambridge
team's report, and used what they said were
important covariates to use.

Q. You understand that the trust funds have 
not employed either the Cambridge team or 
Dr. Harrison to give opinions in this case. 


Correct?

A. Yes, I understand that.

Q. And have you intended to determine or
have counsel ask if Dr. Dement or Dr. Harris could
identify any of the covariates that they believe
would be relevant to their use of the NHIS data?

A. I'm not sure I followed the full thrust
of that. Have I asked -- ? Would you reread
that? I apologize.

Q. You've talked about the covariates used
by Harrison and by the Cambridge team.

A. Correct.

Q. I'm asking you whether you know what
covariates Dr. Dement or Dr. Harris would consider
important, if any, to determine the reliability of 
the National Health Institute survey data. 

A. No, I don't. But the whole point of this 
report is you have to be inclusive or otherwise you 
can't have faith in the analysis. So my example of 


being inclusive was to include things that other 
plaintiffs thought were important. 

Q. But you don’t know — I'm just asking 
you, do you know what Dement considered important? 
That’s all I’m asking. I didn’t ask anything else. 




A. No, I do not know what's in his mind
about what he thinks is important.

Q. You know he gave a couple of depositions
in the Washington and in the Ohio trust fund
litigation?

A. Mm-hmm, I do know that.

Q. Have you read those?

A. I read them for Northwest. I believe
I've read parts of them for Ohio but it was a while
ago, so maybe I haven't read the depositions
recently.

Q. How much have you earned in consulting as
an expert witness in the tobacco industry since you
first began consulting on AG cases? How much have
you been paid?

A. I don’t know. I would have to go back 
and look. I don't have that right here. 

Q. Give me a rough estimate. You must have
some idea, Professor.

A. Probably a few hundred thousand dollars.

Q. A few hundred, like less than five and
more than a hundred thousand?

A. That's my guess, yes. But that might be
wrong.

Q. You've used the term incidence or
lifetime incidence-based assessments of costs, have
you not?

A. I don't believe so.

Q. Then I'm thinking of McCall. I'm getting
[illegible]'s deposition.

MR. BIERSTEKER: You are.

BY MR. WITHEY:

Q. What is the significance to the numbers
you've generated in let's say Table 1 --

A. This is the supplement, right?

Q. Yes. -- Table 1 on page 7 of what you're
calling "smoking group" under the NHIS as opposed
to NMES of both the bias and variance ratio? What
do those numbers mean?

A. Can we focus on the first row and then 
describe that? 

Q. Sure. 

A. Good. So what that first row is saying 
is, NMES defines two groups. One is current
smokers and the other is never smokers. So the
title of the table says "propensity score analyses
for smokers versus never smokers," and there are
two kinds of smokers; there's current and there's
former. So the first row refers to current smokers
in NMES versus never smokers in NMES. Okay? And
there are some number of variables here which we
can figure out by adding up the numbers across the
first row, which is 22 plus 6 is 28 plus 4, 32
[illegible] looks like, 32 variables, 32 covariates
[illegible] describing to discriminate between
current smokers and never smokers. Okay? And when
we do that, using this propensity score method we
find that there is one variable which is called the
propensity score that is a combination of these 32
variables that exhibits the biggest bias. And that
is the direction we really have to worry about,
because there's a difference.

Now, what we find is that the bias is
.81 of a standard deviation, which is quite
substantial. The groups are quite far apart.
Although the variance in the propensity score, the
spread of those distributions is pretty much the
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pom 


same. The .81 is relatively bad news for doing 
adjustments comparing current smokers and never 
smokers, but the .97 is relatively good news 
because they’re both spread out about the same 
amount. So we're not in terrible shape there. But 


we do 


to be very worried about linear modeling 


a d j u s tenefst s or standard adjustments that don’t 


worry 


right 



remain 


def in 



1 

mrreitt sji 


the v 


good 


it that difference. 

Then these other numbers across the 
saying if you look in the direction, the 
word is orthogonal to, but it’s after 
for the propensity score of these 
variables, they have no bias by 
i; they have the same average for both 
lokers and never smokers. And here are 
ice ratios for them. And that's fairly 
because the variance ratios, most of them 


are within a fairly tight bound of four-fifths to 
five-fourths for being able to do it if you did it 
correctly. It's sort of bad news for trusting a 
simple linear regression or a logistic regression 
analysis without coupling it with some propensity 
score adjustment as well. And then it marches down 
to the different subsets in NMES. And then the 
second row, which is for former smokers versus
never smokers in NMES, the meaning of those numbers
is the same.

Then we go to NHIS and because in 
NHIS I guess the analyses that were done by the 
Cambr[idge] team did analyze separately for males and
females, we just did it separately for males and
females. So the third row is current female
smokers versus never female smokers in NHIS, and
along the propensity score there's a .6 of the
standard deviation bias but a variance ratio of
[...]. Which is again not great news because it
[means you] can't rely on simple modeling adjustments
[with] confidence, but it's not so bad as to say
throw the dataset out. There's no way
to throw the dataset out.

And it's a similar story as you march 
down for each of these subgroups. 

Q. Let me just ask you this. For variance 
ratio, the closer you are to zero is the best? 

A. No, the closer you are to 1. 

Q. To 1, I mean. 

A. Because it's saying the groups at least 
had the same spread. 

Q. And the propensity score, on the bias you 
want it closer to 1 as well? 

A. You want it closer to zero. Because 
that’s saying the groups — If the bias were zero 
and variance ratio were 1, the groups would be
sitting right on top of each other and you could
trust [the] model and say, boy, smokers, non-smokers,
same average age, same average marriage status,
same [risk-]taking behavior, same everything.
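
The two diagnostics described in this testimony, the bias along the propensity score in standard deviation units and the variance ratio, can be sketched in Python. This is only an illustration under stated assumptions: the pooled standard deviation formula, the function names, and the toy scores are not taken from the testimony. Only the targets come from it (bias near zero, variance ratio near 1, with four-fifths to five-fourths described as a fairly tight bound).

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(group_a, group_b):
    """Difference in means divided by a pooled standard deviation.

    Assumption: the pooled SD is sqrt((var_a + var_b) / 2); the testimony
    does not state which denominator was used.
    """
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    return (mean(group_a) - mean(group_b)) / pooled_sd


def variance_ratio(group_a, group_b):
    """Ratio of sample variances; 1 means both groups have the same spread."""
    return variance(group_a) / variance(group_b)


def within_bound(ratio, low=4 / 5, high=5 / 4):
    """Check the four-fifths to five-fourths bound mentioned in the testimony."""
    return low <= ratio <= high


# Hypothetical propensity scores for two groups (made-up numbers).
current_smokers = [1.0, 2.0, 3.0, 4.0, 5.0]
never_smokers = [0.0, 1.0, 2.0, 3.0, 4.0]

bias = standardized_bias(current_smokers, never_smokers)
ratio = variance_ratio(current_smokers, never_smokers)
```

In this toy case the two groups have identical spread, so the variance ratio is 1, while the means differ by roughly six-tenths of a standard deviation.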

Q. Have you published in any literature on
the effectiveness of smoking cessation programs?

A. Have I published on the effect of smoking
cessation programs? No.

Q. Have you yourself conducted research into
the effectiveness of smoking cessation programs?

A. [...]

Q. Have you yourself conducted any research
into the reasons people quit smoking?

A. Reasons people quit smoking? No. 

I mean, I don't do research in that area. The only 
reason I was hesitating is wondering whether I've 
worked with somebody who's doing that, but I don't 
think I have, at least in helping them do their 
research on that topic. 

Q. Have you reviewed any documents in 
preparation for your deposition today? 

A. Oh, yes. 

Q. Can you name which ones you've looked at?

A. I've reviewed my reports, two things that
are exhibits. I read and reread especially parts
of Dement's reports and Roberts' report. I reread
the [...] report, parts of it. I reread parts
again of some of Harris' -- all of Harris' reports
[I reread] parts of. I read Harris's recent
[deposition], not the most recent but the December,
[I believe] it was, deposition. I read through
[their] background articles.

Q. What background articles?

A. There were some articles that I read
before on smoking rates for blue collar versus
white collar workers and how they've changed
currently from ten years ago and how the blue 
collar population has apparently changed much less 
than the white collar population in terms of 
modifying their smoking behavior in recent years; 
they just stay high. 

I spent time extending this 
proceedings article on what does it mean to 
estimate the causal effects of smoking, just to add
pieces of drafts of these other sections that we
talked about early in the day, and made some notes
on how to write some parts more clearly than
I thought I had in the document that was attached
to my [...] report here.


I glanced through other pieces of
things that I thought might be relevant. For
example, I did glance through some of my reports in
Minnesota, just to see what I was doing so I hadn't
forgotten it. And Oklahoma, I believe.

Q. Anything else?

A. I think I looked at some documents of
Harris from other litigation quickly. I mean,
obviously [I did] not read it all, but I tried to glance
through [to] see if anything caught my eye that was
particularly important.


Q. Now, other than the opinions you've
expressed in this deposition and the Washington
Northwest Laborers deposition that I deposed you on
and you've stated in your report, do you have any
other opinions or conclusions that you have reached
based on your review of the documents and materials
you relied upon that you have at this time? Again,
I'm not asking about what future work you might do;
just any other opinions.

A. Well, I think something that didn't come
across in the reports which you began to touch on
was this concern with missing data in the Dement
and Roberts reports, and there is a general comment
about [it] but I didn't go into any detail. And
clearly there is an enormous missing-data problem
in the Roberts report as well that we didn't go
into detail on, in the sense that data in
[...] fund years is mostly not there. It was
[...]est value imputation methods that are a
very s[impl]istic model with no reflection of
uncertainty. That's implicit in the report. But
I generally criticize how they handle missing data,
but it's not explicit. So that might be regarded
as another opinion.

Q. Have you conducted any other multiple 
imputation analysis on any of that missing data? 

A. No. 

Q. Do you intend to? 

A. That would be a bear of a problem because 
so little is observed. 

MR. WITHEY: I don't have any further 
ERRATA SHEET HANDLING/DISTRIBUTION


The original of the Errata Sheet has been delivered
to [...] Biersteker, Esq. When the Errata Sheet
has been completed by the deponent and signed, a
copy thereof should be delivered to each party of
record and the original thereof delivered to


Michael E. Withey, Esq., to whom the original 
deposition transcript was delivered. 


PLEASE REPLACE THIS PAGE OF THE TRANSCRIPT 
WITH THE COMPLETED AND SIGNED ERRATA SHEET 
WHEN YOU RECEIVE IT. 
Commonwealth of Massachusetts )

County of Suffolk )

I, J. Edward Varallo, Registered
Professional Reporter and Notary Public in the
Commonwealth of Massachusetts, hereby certify that
there came before me on January 27, 1999, at the
time and place specified above, Donald B. Rubin,
Ph.D., the deponent herein, who was duly sworn by
me to testify to the truth and was thereafter
examined under oath by counsel.

I certify that the questions asked of the
deponent and the answers given were taken down by
me stenographically and transcribed by me using
computer-[aided] translation software; and that the
foregoing is a true and accurate transcript
thereof.

I certify further that I am not counsel,
attorney, or relative of any party litigant, nor
otherwise interested in the event of this suit.


DATED: 


J. Edward Varallo, RPR, RMR 
My Commission Expires 01/11/2002 
Iron Workers: Dr. Donald Rubin Report

Enclosed please find a) the report, with attachments, of Dr. Donald Rubin in the above
matter; and b) a list of references upon which he particularly relies. It is my
understanding that copies of all or most of these particular reliance materials were previously
[provided] to you in connection with the Northwest Laborers case. However, in the event that you
[are unable] to locate copies of the materials, please let me or Barbara Harding know, and we will
[...] to provide copies of those items.

Please further note, as referenced in footnote 4 on page 9 of Dr. Rubin's report, that a
computer diskette containing propensity score analyses of smokers and non-smokers whose health
care was provided by unions in the NMES sample will be sent to you under separate cover by one
of Dr. Rubin's colleagues at his consulting firm. You should receive the diskette on Monday or
Tuesday because, regrettably, the person who performed the analysis under Dr. Rubin's
supervision will not be in the office until next week.




http://legacyJibrary.i4csf.ecBdialji^cjfQtfa3iOi/|DytBlfv.industryGiQCuments. ucsf.edu/docs/xygl0001 








Mr. Michael E. Withey, Esq.
November 6, 1998
Page 2

JONES, DAY, REAVIS & POGUE

Thank you for your attention and cooperation. Please let me know if you have any
questions concerning this matter.
OHIO IRON WORKERS LOCAL 17 LITIGATION
REPORT OF PROFESSOR DONALD B. RUBIN

I am a professor of statistics at Harvard University. I served as Chairman of Harvard's
Statistics Department for nine years, from 1985 to 1994. A copy of my most recently-prepared
curriculum vitae is attached to this report as Exhibit 1.

I have been asked to analyze the estimates prepared by plaintiffs' experts, Drs. Harris and
[Dement] and Mr. Roberts, of the excess health care spending by the plaintiff union trust funds in the
[Iron] Workers Local [17] litigation attributable to defendants' alleged misconduct. [In]
this Report, I express two broad opinions, each of which is explained more fully below:

1. Reliable and statistically valid estimates of the health care expenditures, if any,
incurred by the plaintiff trusts as a result of defendants' alleged wrongful conduct can be
calculated.

[2.] [Two] of the plaintiffs' experts, Dr. Dement and Mr. Roberts, do not even
[attempt to estimate the] excess medical expenditures of the trusts due to the effect of the
defendants' alleged misconduct at all, including, particularly, the excess expenditures of the
[trusts] due to the [effect] of that alleged misconduct on the trusts' behavior.

[3.] Nor do Dr. Harris' analyses contained in his October 12 and 22 reports attempt to
estimate the excess medical expenditures of the trusts due to the effect of the defendants'
alleged misconduct on the trusts' behavior. Dr. Harris' October 22 report does, however,
attempt to estimate the effect of defendants' alleged misconduct generally on the medical
expenditures of the [trusts].

4. Dr. Harris' analyses essentially have none of the characteristics that they must have
to generate statistically valid estimates of the trusts' expenditures, if any, that were incurred
because of the defendants' alleged misconduct generally. Those estimates, confidence
intervals, and any statistical significance claimed for those damage estimates provided by Dr.
Harris consequently are unreliable and statistically invalid.

[5.] It appears that all three experts have attempted to estimate the excess medical
expenditures of the trusts due to the existence of smoking. But those estimates, confidence
intervals, and [any statistical significance claimed] for those estimates provided by the plaintiffs'
experts are unreliable and statistically invalid.
HOW TO ADDRESS THE QUESTION


A. Introduction

To address the question of what excess sums, if any, the trusts expended as a result of
defendants' alleged misconduct, one must compare the health care costs that the trusts factually
incurred to what those costs would have been in counterfactual worlds without the alleged
misconduct. [...] one has to specify precisely what the alleged misconduct was and when it
occurred, and what would have [occurred] in the absence of that misconduct and when it would
have occurred.

[Because] there are [no] data on the effect of defendants' alleged misconduct on
the trusts' [...], two models [must be] used: a behavioral model and a medical expenditure
model.

The behavioral model would [have one] or two components, depending upon whether the trust
funds can recover excess [...] due to (1) the general effect of that alleged misconduct,
including effects on the [...] the behavior of the individual participants in the trust,
or (2) the effect of the defendants' [alleged] misconduct on the trusts' behavior.

[...] would estimate the effect of the alleged misconduct on
the subsequent [smoking] behavior of individuals who were or would have been participants in the
plaintiff trusts, [regardless] of whether [any] change in smoking behavior was due to changes in the
trusts' behavior.

The more complicated behavioral model would have two components. The first would
address "individual-initiated" changes in smoking behavior and the second would address
"trust-initiated" changes in smoking behavior. The first component would estimate the effect of the alleged
misconduct on the subsequent smoking behavior of individuals who were or would have been trust
participants, assuming no trust-initiated behavior was inhibited by the alleged misconduct. I call this
the "individual-initiated" component of the behavioral model because its focus is on the savings, if
any, to the trusts, due to possible changes in individuals' smoking behavior caused exclusively by the
effect on the individual of the lack of the alleged misconduct in a counterfactual world.

The second component of the behavioral model, the "trust-initiated" component, would
estimate the effect of the alleged misconduct on trust behavior in further reducing the smoking of its
individual members through, for example, preventative programs. I refer to this as the
"trust-initiated" component [of the] behavioral model because it focuses on the savings, if any, to the trusts,
[due to] possible [changes in] individuals' smoking behavior caused exclusively by the effect of some
[behavioral] action taken by the trust as a result of the lack of the defendants' misconduct in a
counterfactual world.


The second [model], the medical expenditure model, would estimate the effect on the trusts'
[...] changes in smoking behavior estimated from the first model. The cost savings
[that] a trust would [have realized] in a counterfactual world without the alleged misconduct are the
[reduced] costs, if any, [caused] exclusively by trust-initiated changes in the smoking behavior of
[individual] trust [members].

There are [...] essential features of the proper data collection and modeling approach
[to address] the [questions posed] by this lawsuit. For example:


1. Both models must control for important background and other confounding variables
so that they compare behaviors and costs for like individuals, that is, for matching individuals
in the factual world (with defendants' alleged misconduct) and in the counterfactual worlds
(free of that alleged misconduct).

[2.] [...] misconduct occurred and continuing through each subsequent, relevant year.

3. Both models must focus on the population of interest. Here, that is those individuals
[...] litigation.
4. Both models must consider distinct types of smoking behavior that can lead to
different health care expenditure outcomes.

5. For those aspects of either model that rely on assumptions rather than data, the
assumptions must be explicated and justified. It is critical, moreover, that each such
assumption be capable of being individually assessed and altered.

6. The possibility that smokers' other health-related behavior may be affected by the
alleged misconduct must also be considered.

7. The statistical techniques employed in both models and in the studies upon which
[the models] rely must be reliable and statistically valid. For instance, missing data must be
[handled] in an appropriate manner; adjustments for background and other confounding
[variables] must be made after taking into account differences in the distribution of those
variables in the groups being compared; and sound statistical methods must be employed to
reflect the reliability and uncertainty in the resulting estimates.

8. [Any] health care costs [due to] smoking that would have occurred regardless of the
defendants' alleged wrongdoing should be excluded from the damage estimate. For instance,
any increased health care costs due to smoking that occurred before any alleged misconduct
[...] [must be] excluded.

[To] the extent that the [...] limits the trusts' recovery to excess health care costs
[attributable to the] effect [of] defendants' alleged misconduct on the trusts' behavior,
[any reduction] in health care costs [that] would have occurred in a counterfactual world without
[the alleged] misconduct and [without] any modified behavior on the part of the trusts (i.e.,
[individual-]initiated changes in smoking behavior) must also be excluded. For instance, cost
[savings from] individuals [whose] smoking would have decreased without the defendants'
[alleged] misconduct, regardless of whether or not the trusts modified their behavior, must be
[excluded]. [...] expenses [...] in instituting preventative programs would have to be
[...] any potential [savings] due to reduced smoking prevalence arising from such [...].

[B.] Behavioral Model

[1.] The one-component behavioral model

Consider [first] the model of the effect of defendants' alleged misconduct on the smoking
[behavior] in the pertinent population. Suppose it were alleged that the defendants failed to disseminate
information about the health effects of cigarette smoking in 1965. I would first want to estimate the
effect of that alleged failure on the smoking behavior of the individuals who were or would have been
the recipients of the trust-funded health care. For each distinct type of smoking behavior that is
considered relevant to health care costs, I would estimate its prevalence in the factual world and in
the counterfactual world.

This estimation would be based on factual world data and on expert opinion concerning the
[effect] on smoking cessation and smoking initiation of the availability of additional information about
smoking's health risks [and] the availability of alternative smoking products. Such a model of smoking
behavior would [need] to consider background characteristics of the individuals who were or
would have been [recipients] of the trust-funded health care; control for those background
characteristics and other confounding factors that alter the effect of the alleged misconduct on
smoking behavior; and consider how changes in smoking behavior in the pertinent population would
[...] of the damage period.

[Even when the] objective of an analysis is solely to estimate aggregate quantities (i.e., not
[within subgroups such as] defined by year of birth and race), one must generally control for
background characteristics (such as year of birth and race) and other confounding factors (such as
[non-]smoking health-related behaviors), which may alter the effect of the alleged misconduct on
smoking behavior, in [order for] the resulting estimates and inferences to be valid and reliable.

The task of [...] a behavioral model is not as daunting as it might appear: simple
[...]locations of factual world prevalences of different types of smoking behaviors within specific
[groups] of people lead to estimates under explicit assumptions.
2. The two-component behavioral model

Alternatively, the two-component behavioral model would permit the plaintiff trusts to
estimate the excess health care costs, if any, that they incurred as a result of the effect of defendants'
alleged misconduct on the trusts' behavior.

a. Individual-initiated changes in smoking behavior

[Consider first] the "individual-initiated" component of the model of the effect of defendants'
[alleged] misconduct on smoking behavior [in] the pertinent population. Suppose, again, that it were
alleged that the defendants failed to disseminate information about the health effects of cigarette
smoking in [1965]. I would first want to estimate the effect of that alleged failure on the smoking
behavior of the individuals who were or would have been the recipients of trust-funded health care,
assuming no [changes] in the trusts' behavior in the counterfactual world. For each distinct type of
smoking behavior [that is relevant] to health care costs, I would "estimate" its prevalence
in the factual world and the counterfactual world.

This estimation would be similar to that described above for the one-component model. It
would be based on [data] and expert opinion concerning the effect on smoking cessation and smoking
initiation of the availability of additional information about smoking's health risks; it would consider
background characteristics of the participants in the trusts; it would adjust for those background
characteristics and other confounding factors that alter the effect of the alleged misconduct on
smoking behavior; [and] it would consider how changes in smoking behavior in the pertinent
population would [...] from 1965 through the end of the relevant period. This is necessary to obtain
estimates and inferences that are valid and reliable, even when the objective of an analysis is solely
to estimate aggregate quantities (i.e., not within subgroups such as defined by year of birth and race).
b. Trust-initiated changes in smoking behavior

Now consider the second component of the behavioral model, the "trust-initiated" component,
which concerns the behavior of the trusts due to alleged misconduct. The model must supplement
the previous analysis with a posited potential reduction in the prevalence of smoking, also
disaggregated by background characteristics and other confounding factors describing the trusts'
members, that the trusts would have caused in the absence of the alleged misconduct. Suppose again
[that it] were [alleged] that defendants failed to disseminate information about the health effects of
cigarette smoking in 1965. I would want to estimate the effect of that alleged misconduct on the
[behavior] of each trust, and then estimate the effect of any modified behavior by the trusts due to lack
of alleged misconduct on the smoking prevalence of the individuals who were, or would have been,
[recipients of] trust-[funded] health care. For each distinct type of smoking behavior that is
[relevant] to health care costs, I would again "estimate" its prevalence in the counterfactual
[world for] members of each [trust]. The analysis would, in general, require a separate longitudinal model
[for each trust] because any potential trust-initiated changes to smoking behavior due to lack of alleged
misconduct could have [varied] across trusts and in time.
[C.] Health [Care] Expenditure Model

Once I had modeled the effect of the alleged misconduct on smoking behavior over time, I
would then model the effect of that changed smoking behavior on the health care expenditures of the
[trusts'] members.

[The] model of the effects of the changed smoking behavior on health care expenditures of the
trusts' members would also have to consider characteristics of the individuals who were or would [...]
such as health-related behaviors, that may be important predictors of how smoking behavior affects
the medical costs of the trusts' members. As with the behavioral model, ideally each such
characteristic would be measured on each individual before the moment in time when the alleged
misconduct [had an] effect on that characteristic for that individual.

The health care expenditure model also would have to take into account the passage of time
because costs accumulate in time and smoking behavior can change in time.

[This model] may have [...] literal similarities to the plaintiff's experts' medical
expenditure [models] presented in litigation brought by the various states against these same
defendants, [but without] their errors, with implicit assumptions made explicit, and with proper
consideration of the passage of time.

[D.] Illustrative Examples

The [...] characteristics of the models necessary to address the trusts' expenditures
incurred due to defendants' alleged wrongful conduct can be illustrated by examples. These
examples illustrate specific [...]; they are not meant to suggest that the models need
to operate at this level of detail.

1. Example One

Four essential features of the modeling required here are illustrated by a basic example.
First, the example demonstrates that, in order to estimate separately the effect of defendants' alleged
misconduct on [the trusts'] behavior, one must consider the factual world with the alleged misconduct
and at least two counterfactual worlds without the alleged misconduct, one without changed behavior
on the part of the trust and another with possibly changed behavior on the part of the trust. In
contrast, if one only wants to estimate the effect of the defendants' alleged misconduct in the
aggregate, then one [need only] consider the factual world with the alleged misconduct and only one
counterfactual world. Second, it shows the need to consider background and other confounding
characteristics. Third, it illustrates the need to consider the passage of time. Finally, the example
illustrates that there is modeling uncertainty about what would have happened in a world without the
alleged misconduct, and so assumptions and bases for assertions about counterfactual worlds without
the alleged misconduct must be explicated.

In this example, we consider five worlds, one the factual world as it exists and four
[counterfactual worlds without] the defendants' alleged misconduct starting in 1965. Two of the
counterfactual worlds have no behavioral modification on the part of the trusts, and in the other two,
[the] trusts initiated smoking cessation programs. We consider the same individual from one of the
trusts in all of the [worlds], which obviates the need to control statistically for background
characteristics and [other] confounding factors. We then see how this individual's health care costs
[...] in the five worlds, which leads to the calculation of this individual's trust's "excess"
health care costs due to the alleged misconduct.
Factual World: Defendants Withhold Information
Counterfactual Worlds: Defendants Disseminate Information on Health Risks
  Counterfactual Worlds 1 and 2: Trust's Behavior Unchanged
  Counterfactual Worlds 3 and 4: Trust Initiates Preventative Program in 1970

Year   Factual World      Counterfactual     Counterfactual     Counterfactual     Counterfactual
                          World 1            World 2            World 3            World 4
1935   Born               Born               Born               Born               Born
1955   Starts smoking     Starts smoking     Starts smoking     Starts smoking     Starts smoking
1965   Continues smoking  Quits smoking      Continues smoking  Quits smoking      Continues smoking
1970   Still smoking      Quit for 5 years   Still smoking      Quit for 5 years   Quits smoking
1985   Quits smoking      Quit for 20 years  Quits smoking      Quit for 20 years  Quit for 15 years
1990   Quit for 5 years   Quit for 25 years  Quit for 5 years   Quit for 25 years  Quit for 20 years

[Consider first] counterfactual [world 1], where there are individual-initiated changes in smoking
[behavior due to] the lack of the [alleged] misconduct. There, the measure of the increase in the
[health care] costs due to [defendants'] alleged 1965 misconduct is different in 1970 than
it is in 1990. Specifically:

- [In 1970,] the measure is a comparison between a thirty-five-year-old who
[smoked] continuously [for] 15 years in the factual world versus that same thirty-five-
year-old who would have smoked for ten years but then became abstinent for five
years in the counterfactual world.

- In 1990, the measure of damages for this same person is a comparison between a
fifty-five-year-old who smoked for 30 years and has been abstinent for five years in
the factual world versus that same fifty-five-year-old who would have smoked for
only ten years and quit a quarter century ago in the counterfactual world.

The measure of any increased health care costs of the individual due to the alleged misconduct by
[defendants] toward the individual plainly will be different in different years, requiring a longitudinal
analysis commencing with the date of the alleged misconduct, or its first effect, and continuing
through each subsequent, relevant year.
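
The year-by-year comparison in this example can be sketched in Python. This is an illustrative sketch only: the smoking start year (1955) and the quit years are assumptions inferred from the ages and durations given in the text, and the function and variable names are hypothetical.

```python
def smoking_history(year, start_year, quit_year):
    """Return (years smoked, years abstinent) as of the given year."""
    if year < start_year:
        return (0, 0)
    if year < quit_year:
        return (year - start_year, 0)
    return (quit_year - start_year, year - quit_year)


# Assumed dates, inferred from the ages and durations stated in the text.
START = 1955
QUIT_YEAR = {
    "factual": 1985,           # 30 years of smoking, abstinent 5 years by 1990
    "counterfactual_1": 1965,  # quits when the information is disseminated
}


def compare(year):
    """Smoking histories of the same individual in both worlds for one year."""
    return {
        world: smoking_history(year, START, quit_year)
        for world, quit_year in QUIT_YEAR.items()
    }


for year in (1970, 1990):
    print(year, compare(year))
```

Under these assumptions the 1970 comparison is 15 years of smoking against ten years of smoking followed by five years of abstinence, and the 1990 comparison is 30 years of smoking and five of abstinence against ten years of smoking and 25 of abstinence, which is why the measure has to be computed separately for each year.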

Also, notice that, if this individual's smoking behavior were unaffected by the alleged
misconduct, as in counterfactual world 2, and, therefore, the person would have the same smoking
history in the counterfactual world as in the factual world, there would be no effect of the alleged
misconduct on health care costs for the individual.

Now consider the excess health care costs to the trust caused by the effect of the alleged
[misconduct] on the trusts. Relative to the factual world, in both counterfactual world 1 and
counterfactual world 2, there are no excess health care costs incurred by the trust due to the effect
of the defendants' alleged misconduct on the trusts: the trust's behavior was unaffected by the alleged
[misconduct]. That is, although there are individual-initiated changes in behavior in counterfactual
world 1 (but not in counterfactual world 2), there are no trust-initiated changes in either
[counterfactual world 1 or] counterfactual world 2.

Now consider counterfactual worlds 1 and 3: the individual's behavior is the same in both
counterfactual worlds, unaffected by the trust's modified behavior due to the alleged misconduct.
[In this] scenario, once again, there are no excess costs to the trust due to the alleged misconduct on
the trusts' behavior because the individual's behavior is unaffected by the trust's behavior. The
[differences] in behavior from the factual world are solely individual-initiated.

Finally, consider counterfactual worlds 2 and 4. In this comparison, the increase in the trust's
health care costs for this individual due to the effect of the alleged misconduct on the trust
[is the] extra costs from 1970, when the trust initiated its preventative smoking program and affected
the individual's smoking behavior, through the end of the relevant period. This amount is in contrast
to the potential increase in medical costs [for the individual, which]
would begin to accrue in 1965 if that individual's smoking behavior changed in 1965, as in
counterfactual world 1.

2. Example Two 


In this example, there is the factual world, as it exists with the defendants' alleged misconduct, and two counterfactual worlds without the defendants' alleged misconduct. In counterfactual world 1, without the alleged misconduct, there were no behavioral changes for either the individual or the trust. In counterfactual world 2, without the alleged misconduct, in contrast, in 1975 a trust-initiated smoking prevention program caused the individual to stop smoking, but the cessation of smoking by the individual was accompanied by overeating, which had its own adverse health effects.





A not unlikely possibility. See, e.g., American J. of Epidemiology 1998; 148:821-830, 831-832, suggesting that women who permanently quit smoking gained, on average, 19.2 pounds over five years. The corresponding weight gain for men was [illegible].




http://legacy.library:ucsf.eSiffittiDbfpDb|Q^afMW|Oclt. mdustrydocuments.ucsf.edu/docs/xygj0001 



t 


Counterfactual Worlds 1 and 2: Defendants Disseminate Information on Health Risks

Year | Factual World with Alleged Misconduct Starting in 1965 | Counterfactual World 1: No Change in Trust Behavior | Counterfactual World 2: Trust Initiates Smoking Prevention Program
[illegible] | Born; has family history of heart disease | Born; has family history of heart disease | Born; has family history of heart disease
[illegible] | Starts smoking heavily; normal diet | Starts smoking heavily; normal diet | Starts smoking heavily; normal diet
1965 | Defendants withhold information on health risks; continues smoking | Defendants disseminate information on health risks; continues smoking | Defendants disseminate information on health risks; still smoking
1975 | Still smoking heavily; slightly overweight | Still smoking heavily; slightly overweight | Trust starts smoking prevention program; quits smoking; gains weight
[illegible] | No change | No change | Becomes severely overweight
1994 | Contracts lung cancer and dies | Contracts lung cancer and dies | No change
1995 | No longer alive | No longer alive | Has heart attack and bypass surgery
1996 | | | Has second heart attack
[illegible] | | | Still receiving health benefits from trust



This example illustrates additional essential characteristics of the proper data collection and analysis approach, namely the need: (a) to consider the behavioral modification of trusts and individuals in counterfactual worlds, (b) to consider background and other confounding characteristics, (c) to perform the analysis over time, and (d) to explicate assumptions. First, it illustrates that one must consider the possibility that smokers' other health-related behaviors (e.g., overeating or alcohol consumption) might be different in a counterfactual world without the alleged misconduct. Second, the example illustrates that trust expenditures can be higher in a world without the alleged misconduct than in a world with the alleged misconduct.

In particular, in the factual world in this example, the trust bears the medical expense of this individual's lung cancer and related death in 1994, whereas in the counterfactual world, the trust bears the medical expense of this individual's first heart attack and by-pass surgery in 1995, as well as the second heart attack in 1996 and all medical costs until the individual dies in 2005. Clearly, the costs to the trust caused by its own behavior in the counterfactual world will be substantially greater than in either the factual world or a counterfactual world without trust-initiated changes in the behavior of the individual, because of the additional expenditures needed to treat the heart attacks, by-pass surgery and [illegible] medical expenditures.

The health care expenditure model also would have to take into account the passage of time because costs accumulate in time. Instead of comparing total costs through the end of the damages period, one could compare costs up to some event in time, such as the earliest death in any world. For instance, in example two, where the earliest death of this hypothetical individual occurs in 1994, his health cost stream through 1994 would be less in counterfactual world 2, without the defendants' alleged misconduct and with trust-initiated smoking prevention programs, than in the factual world with defendants' alleged misconduct (or in counterfactual world 1 without alleged misconduct or change in the trust's behavior).

Because the alternative worlds are counterfactual, there is no direct evidence to estimate how likely each would have been in the absence of the alleged misconduct. Instead, evidence from data observed in the factual world must be coupled with assumptions to estimate the likelihood of alternative counterfactual worlds.
Finally, from a statistical perspective, such an analysis in each counterfactual world need not be done for each individual who was or would have been a participant in the plaintiff trusts. Rather, such an analysis is needed for each subgroup defined by the pertinent smoking behaviors, background characteristics and confounding factors discussed previously. The subgroup specific results are then aggregated across all the subgroups to create an estimate of each trust's costs due to the alleged misconduct.
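
The subgroup aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subgroup labels, participant counts, and per-person costs are hypothetical placeholders and do not come from the report or from any data set, and the counterfactual means would in practice have to be estimated under explicit assumptions.

```python
from dataclasses import dataclass


@dataclass
class Subgroup:
    label: str                  # hypothetical subgroup description
    n_participants: int         # trust participants in this subgroup
    factual_cost: float         # mean per-person cost in the factual world
    counterfactual_cost: float  # ASSUMED mean per-person cost without the alleged misconduct


def excess_cost(subgroups):
    """Aggregate subgroup-specific results: participants x (factual - counterfactual) mean cost."""
    return sum(
        g.n_participants * (g.factual_cost - g.counterfactual_cost)
        for g in subgroups
    )


# Hypothetical numbers, for illustration only.
subgroups = [
    Subgroup("heavy smokers, age 50-64", 200, 5000.0, 4200.0),
    Subgroup("light smokers, age 50-64", 300, 3000.0, 2900.0),
    Subgroup("never smokers, age 50-64", 500, 2000.0, 2000.0),
]
total = excess_cost(subgroups)
print(total)
```

Because each subgroup carries its own counterfactual assumption, any one of them can be varied separately to see how the aggregate responds.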

Comments On The Task 

[...] from observational data in this way requires care. Ideally the effort [...] a team of experts, including ones having knowledge about medical [...] models. Nevertheless, there is a straightforward scientifically and [...] drawing causal inferences about any medical expenditures due to the [...].

[To be] scientific, the underlying assumptions must be explicated in detailed and disaggregated form. This way, the assumptions can be assessed individually for plausibility and can be altered individually to allow evaluation of the consequences on answers. When all the assumptions are [illegible] together, the analysis may be little more than a subjective assumption of the answer.

Statistically, [each act] of defendants' alleged misconduct (specified by its character and [timing]) can be viewed as defining a level of a factor in a hypothetical factorial experiment. For example, one factor could correspond to the defendants' alleged failure to disseminate adequately information regarding the health risks of smoking, and levels of this factor could correspond to the kinds of information and their dates of dissemination. A second factor could be the alleged failure of defendants to market "safer" cigarettes. The combinations of different levels of these factors would define the defendants' conduct in a counterfactual world without specific acts of alleged misconduct. Issues of additive versus synergistic effects of the alleged acts of misconduct would be addressed by [illegible] interactions in the hypothetical experiment. For instance, in the absence of all interactions, costs due to specific alleged acts of misconduct would simply be the aggregate (sum) of the costs due to the specific, individual alleged acts of misconduct. This framework allows the explication and disentangling of the assumptions needed to address the question of the effect on the trusts' health care costs due to defendants' alleged misconduct.

An overview of the framework is presented in Rubin (1998), "What does it mean to estimate the causal effects of 'smoking'?" Invited paper, August 10, 1998, American Statistical Association Section on Epidemiology [illegible].
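
The additive-versus-synergistic distinction in the hypothetical factorial experiment can be illustrated with a minimal two-factor sketch. Everything here is an assumption made for illustration: the two factors are coded 1 when an alleged act is present and 0 when it is absent, and the cost figures are invented, not taken from the report.

```python
def decompose(cost):
    """Split the total effect in a 2x2 layout into two single-act effects and an interaction.

    `cost` maps (act_a, act_b) -> cost, where 1 means the alleged act is present
    and 0 means it is absent (the counterfactual level of that factor).
    """
    baseline = cost[(0, 0)]
    effect_a = cost[(1, 0)] - baseline   # cost due to act A alone
    effect_b = cost[(0, 1)] - baseline   # cost due to act B alone
    total = cost[(1, 1)] - baseline      # cost due to both acts together
    interaction = total - (effect_a + effect_b)
    return effect_a, effect_b, interaction, total


# Hypothetical costs. With no interaction, the total is the sum of the parts.
additive = {(0, 0): 100.0, (1, 0): 130.0, (0, 1): 120.0, (1, 1): 150.0}
# With a synergistic interaction, the total exceeds the sum of the parts.
synergistic = {(0, 0): 100.0, (1, 0): 130.0, (0, 1): 120.0, (1, 1): 170.0}

print(decompose(additive))
print(decompose(synergistic))
```

When the interaction term is zero, the costs of the individual acts simply add; a nonzero term is exactly the part of the total that cannot be assigned to either act alone.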

Apart from any applicable legal considerations, the question of the effect of defendants' alleged misconduct on the health care expenditures of the plaintiffs, in principle, can be addressed by statistical analysis of appropriate data coupled with explicit statistical assumptions. Any statistical analysis that can address this question must: distinguish between smoking behavior affected and unaffected by the alleged misconduct; take into account the passage of time; adjust reliably for differences in background characteristics and other confounding factors between those exposed to cigarette smoke and those not exposed to cigarette smoke; focus on the relevant population; explicate assumptions; use valid statistical methods; and account for the timing and nature of defendants' alleged misconduct (e.g., exclude any health care costs due to smoking that occurred before [illegible] by defendants).

Dr. Harris makes no effort to estimate separately the effect of defendants' alleged misconduct on the behavior of the plaintiff trusts and, in turn, the effect of that change in trust-initiated smoking [illegible] behavior on the trusts' health care expenditures. Dr. Harris does present estimates of the effect of both the existence of smoking and defendants' alleged misconduct generally on the trusts' health care expenditures. His analyses, however, have essentially none of the characteristics that are necessary to estimate the health care costs incurred by the trusts because of the existence of smoking or defendants' alleged misconduct generally. As detailed more fully below, Dr. Harris' analyses fail to: a) take into account the passage of time in the manner I described; b) control for important background and other confounding characteristics; c) distinguish between different types of smoking behavior that can lead to different health care expenditure outcomes; d) focus on the participants in the plaintiff trusts; e) disaggregate assumptions in a way that enables the assumptions to be assessed and varied individually to see the consequences on the resulting estimates; f) consider the possibility that smokers' other health-related behaviors may also change in the counterfactual world; or g) employ appropriate statistical methods or rely on the results of studies that do so.

Of Smoking 


To estimate the health care costs incurred by the plaintiff trusts as a result of the existence of smoking (as opposed to defendants' alleged misconduct), one has to compare the plaintiff trusts' health costs in the factual world to what those costs would have been in a counterfactual world in which no cigarette existed. To be reliable and valid, such an estimate must possess many of the same essential characteristics that I described above in the first section of this report.

Dr. Harris' estimates do not have those characteristics. His estimates of these smoking attributable expenditures depend on two quantities: the relative expenditure ratios of ever smokers compared to never smokers, or "r," and the prevalence of ever smoking, or "p." His factual world values for both "p" and "r" are unreliable.
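
To see why errors in "p" and "r" matter, it helps to look at how an attributable fraction responds to them. The sketch below uses the standard textbook attributable-fraction formula, p(r - 1) / (1 + p(r - 1)). That formula is an assumption made here for illustration; the report does not state which formula Dr. Harris used, and the input values are hypothetical.

```python
def attributable_fraction(p, r):
    """Textbook attributable fraction for prevalence p and expenditure ratio r.

    If never smokers cost c on average and ever smokers cost r * c, total cost is
    c * (1 + p * (r - 1)); the share above the never-smoker level is returned.
    This treats ever smokers as otherwise identical to never smokers, which is
    precisely the kind of assumption that needs to be made explicit.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if r <= 0.0:
        raise ValueError("r must be a positive ratio")
    excess = p * (r - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: small changes in r move the answer noticeably.
for r in (1.2, 1.5, 1.8):
    print(r, round(attributable_fraction(0.5, r), 4))
```

Under this formula, an "r" that is overstated because smokers and nonsmokers differ in other ways feeds directly into an overstated fraction.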

Dr. Harris' factual world ever smoking "p" values are based on a piece-wise linear model [illegible] at 1974, is not calculated from any particular data set, nor allowed to vary by background characteristics and other confounding factors of individuals, but is simply postulated by Dr. Harris, based on his subjective assessments. The line reflecting ever smoking prevalence from 1953 to 1974 does not appear to be a close match to the data points on which Dr. Harris relies, and there is no confidence interval for his determinations, even when they appear to be based on the analysis of data under particular assumptions.

In determining his "r" values, Dr. Harris relies on a group of studies of health care expenditures. There are a variety of problems with those studies, illustrated below.

First, it is well known in many branches of statistics, epidemiology, and economics that it is [illegible] other confounding variables when trying to estimate causal [illegible] situations where, for example, similar smokers and nonsmokers [illegible]. Even when interest focuses solely on aggregate costs and not on subpopulations, background and other confounding variables that define subpopulations must be considered to obtain valid estimates of aggregate [illegible].




Nevertheless, some of the studies relied on by Dr. Harris do not even attempt to control for possibly important confounding variables; others attempt to do so but do not adjust for those factors reliably. My review suggests that none of the studies relied on by Dr. Harris for his values of "r" adjust for the [illegible] background characteristics that even the experts retained by the plaintiff in [illegible] cases have adjusted for. See, L.S. Miller, X. Zhang, T. Novotny, D.P. Rice, and W. Max, "State Estimates of Medicaid Expenditures Attributable to Cigarette Smoking, Fiscal Year 1993," Public Health [illegible] 1998) at 141-142; Minnesota Trial Testimony of Dr. Wyant at 5372-73 and 5391-93; Oklahoma April 27, 1998 Report of Dr. G. W. Harrison at 18-20 and 39; Miller, et al., Smoking Attributable Medicaid Care Costs: Models and Results, September 1997, Tables 2.4-2.8 at 23-27 and 29-30; David M. Cutler, et al., The Impact of Smoking On Medicaid Spending In Massachusetts: 1970-1997, June 15, 1998, Table IV.2.

Moreover, none of these studies that I have examined reliably adjusts even for the background [illegible] they purport to adjust for. [illegible] Supplemental Report [illegible] and in my July 1998 Report in Oklahoma, the differences between smokers and nonsmokers [illegible] are so substantial that the attempted adjustments using regression methods are known to be entirely unreliable. Furthermore, such differences persist even within the [illegible] subset.

Second, none of the studies on which Dr. Harris relies explicitly considers a union trust fund [illegible] population, much less a population of union trust fund participants in Ohio. Dr. Harris implicitly appears to assume that relative expenditure ratios of ever smokers compared to never smokers computed based on the general U.S. population or populations of employees of a given firm apply equally to the participants in the plaintiff trusts, even though the trust participants may differ in important ways from those other populations. The ratio of medical expenditures between nonsmokers and smokers for the participants in the plaintiff trusts may differ from that of the national population or of employees of a given firm. Yet, Dr. Harris' analysis provides no evaluation of whether the values of "r" he uses are applicable to the participants in the plaintiff union trusts.



Ideally, any characteristic that might be considered important should be used because subsequent analysts can always average over that characteristic, thereby ignoring it, whereas an analysis that leaves it out initially has that characteristic confounded with smoking behavior.

A diskette containing this analysis will be sent to plaintiffs' counsel under separate cover.
Third, based on my review, many of these studies confront missing data problems, but, as I have described in my reports in other cases, they use techniques documented in the statistical literature to be flawed. See, e.g., Little and Rubin, Statistical Analysis with Missing Data (John Wiley, 1987); Panel on Incomplete Data, National Academy of Sciences, Incomplete Data in Sample Surveys, Vol. 1-3 (Academic Press, NY 1983).

For example, many of the studies relied on by Dr. Harris impute values for the data that are missing and use the resulting data sets as if the imputed values were real data. But, in the instances I have examined, the studies do not use valid statistical methods to impute these missing data, and, instead, use scientifically invalid procedures. Among the errors made by the studies relied on by Dr. Harris for his values of "r" are: using [illegible] predicted values for missing data, imputing the missing data sequentially without proper conditioning, imputing the missing data using only part of the available [illegible], imputing the [illegible] without taking into account their uncertainty, relying on strong unsubstantiated assumptions, [illegible] arbitrary values to replace missing values, and failing to conduct informative [illegible] of the imputations, such as varying fundamental assumptions underlying the imputations. As I have described in detail in other reports, these kinds of errors [illegible] the data analyses [illegible] conclusions about "r" values relied on by Dr. Harris are statistically unreliable and invalid.
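
One of the errors listed above is treating imputed values as real data, which ignores the uncertainty of the imputations. A minimal sketch of the standard multiple-imputation combining rules shows how that uncertainty can enter the final variance. The numbers are hypothetical, and nothing here is drawn from the studies criticized in the report.

```python
from statistics import mean, variance


def combine(estimates, variances):
    """Combine results from m >= 2 multiply imputed data sets.

    Returns the pooled estimate and its total variance T = W + (1 + 1/m) * B,
    where W is the average within-imputation variance and B is the
    between-imputation variance of the m estimates.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical estimates of a ratio from three imputed data sets.
pooled, total_var = combine([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

Using a single imputed data set as if it were complete corresponds to reporting only the within-imputation variance, which understates the uncertainty whenever the imputations disagree.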

Dr. Harris' analyses of both "p" and "r" merely compare never and ever smokers ([illegible] former and current smokers). Differences in smoking behaviors that epidemiologists, physicians, and even Dr. Harris and plaintiffs' expert, Dr. Lauer, appear to believe are important (e.g., smoking intensity, smoking duration, type of cigarette smoked, or time since quitting) are nowhere accounted for in his analyses of either "p" or "r." To be valid, such differences, as discussed above, must be taken into account in appropriate subgroups defined by demographic characteristics and other health-related behaviors.

Finally, Dr. Harris' analyses do not take into account the passage of time, consider the possibility that other health-related behaviors (e.g., overeating or illicit drug behavior) may be affected by the existence of smoking, or explicate necessary assumptions in a manner that allows them individually to be [illegible] and evaluated or altered to determine the effects on the results.

B. The Effect Of Defendants' Alleged Misconduct Generally

Dr. Harris [illegible] to assess separately the effect of defendants' alleged misconduct on the trusts' behavior specifically (two-component behavioral model). Dr. Harris [illegible] to estimate the effect of defendants' alleged anti-competitive conduct generally on the trusts' health care expenditures (one-component behavioral model). He presents results from his one-component analysis in his October 22 report in this case, which incorporates by reference much of the [illegible] analysis in his July 28, 1998 report submitted in the Maryland [illegible] entitled "Proportion of Health-Care Spending Attributable to Tobacco Manufacturers' Anti-Competitive Conduct: State of Maryland Medicaid Program, 1970-1997." Although Dr. Harris recognizes the need to consider separately the effect of defendants' alleged misconduct on [illegible] behavior and the effect of that change in smoking behavior on the trusts' expenditures, the estimates he presents are neither reliable nor statistically valid.

See, e.g., Oct. [illegible], 1995, "Notes on Changes Over Time in the Mean Duration of Quitting [illegible] Former Smokers: Impact On [illegible] the Reduction in the Excess Risk of Quitting."

See, Undated original report of Dr. Lauer in this matter, entitled "Smoking and Coronary Heart Disease" at 12-14 [illegible].

I ■-* 

Dr. Harris' estimates, as an initial matter, critically depend upon his estimates of the factual world values of "p," "r," and his resulting estimates of the factual world health care costs incurred by the trusts because of the very existence of smoking. Because those estimates, as discussed above, are unreliable and invalid, so, too, are Dr. Harris' estimates of the effect of defendants' alleged [illegible] change in smoking behavior of the trusts' participants and the consequent change in the health care expenditures of the trusts [illegible] the counterfactual world were reliable and valid.

Dr. Harris' further analyses [illegible] effect of the defendants' alleged misconduct on smoking behavior [illegible] counterfactual world, additionally, are neither reliable nor statistically valid. Those analyses [illegible] essentially none of the characteristics that are necessary to [illegible] health care costs incurred [illegible] the trusts because of defendants' alleged misconduct. As [illegible] world estimates, [illegible] fails to: a) take into account the passage of time in the manner I described; b) control for important background and other confounding characteristics; c) distinguish between different types of smoking behavior that can lead to different health care expenditure outcomes; d) focus on the participants in the plaintiff trusts; e) disaggregate assumptions in a way that enables the assumptions to be assessed and varied individually to see the consequences on the resulting estimates; f) consider the possibility that smokers' other health-related behaviors may also change in the counterfactual world; or g) employ appropriate statistical methods.



I 


a subjective statement 


r*nu>oObtt* gakfewcT’ 

thout 


(V ft f, '* l 

**rrntel *«s*mA* 

world . 

<U*a&p-w:i i* *e*e»aO:siJ^ii&V 






: :* : - V<—- v - 

* V .*•%■*• .. v • ’ . .'c .,■ . t > •» . * • - j. ' ■* ,r •,* 




I 

I 


First, Dr. Harris appears to conclude that, in the absence of the alleged misconduct, the tobacco companies would have introduced "safer" products to the marketplace. By this he appears to mean both that products that never were introduced would have been introduced and would have been adopted by consumers, and that the "safer" products factually introduced would have been introduced sooner. But in his reports in this case and in his Maryland report, Dr. Harris never indicates: how [illegible] any "safer" product would have been than products already on the market at that time; the rate at which consumers, and participants in the plaintiff trusts in particular, would have adopted them; or what the effects of switching, if any, to any of these "safer" cigarettes would have been on [illegible] health care expenditures over time. Nor could he do so on deposition. See, e.g., Harris Wisconsin Dep. at 396, 398, and 403; Harris Arizona Dep. at 18, 20, 297, 322-25, [illegible] at 346-5[illegible].


Second, Dr. Harris generally concludes that, in the absence of the alleged misconduct, the tobacco industry would have disseminated information about health risks not then known to consumers, and this information would have modified their behavior. It appears, however, that many factors affect consumer decisions to begin to smoke, to quit smoking, or to switch to "safer" [illegible]. According to the Surgeon General, smoking initiation is affected by parental smoking, [illegible] influence, and a variety of other psychosocial risk factors. U.S. DHHS, Preventing Tobacco Use Among Young People, A Report of the Surgeon General, Ch. 4 (1994). See also Harris Washington [illegible] 25-36. Similarly, a number of factors reportedly affect consumer choices among different cigarette brands that, by some measures, pose different levels of health risks, as well as decisions by existing [illegible] (1981) at 5-6; Redmond, W.H., "[illegible] Process," J. of Pub. Policy & Marketing 15(1):87-97 (Spring 1996) at 87 citing U.S. DHHS, Reducing the Health Consequences of Smoking: 25 Years of Progress (1989). Yet, Dr. Harris incorporates no methodology to quantify the role those factors play in consumer choices. He consequently cannot reliably assert the role that the industry's alleged misconduct played in consumer choices.

! 

[illegible], Dr. Harris obtains counterfactual world values for "p" by shifting back his [illegible] for ever-smoking prevalence from his factual world estimate of 1974 to 1955, 1960, or 19[illegible], [illegible] any empirical [illegible] whatsoever. See, e.g., Harris Washington Dep. at 456-457.

III. DR. DEMENT AND MR. ROBERTS

[illegible] Dr. Dement [illegible] attempts to quantify the impact of defendants' alleged misconduct on the trusts' health care expenditures at all, either generally (one-component behavioral model) or specifically due to changes in the trusts' behavior (two-component behavioral model).

Instead, they exclusively attempt to address a causal question different from one involving defendants' alleged misconduct: What health care expenditures were incurred by the plaintiff trusts as a result of the existence of cigarette smoking, regardless of defendants' alleged wrongdoing? As a result of [illegible] to address this other question, their estimates of smoking attributable expenditures, even if all of the errors in their underlying analyses were eliminated, would still [illegible] health [illegible] due to [illegible]. [illegible] analyses do not have the essential characteristics that I described [illegible]:
Dr. Dement and Mr. Roberts do not even model medical expenditures, but rather disease mortality.

Their analyses do not attempt to control adequately for background and other confounding variables, other than gender and, in some instances, broad age groups.

Their reports distinguish only between current, former, and never smoking, disregarding differences in these categories of smoking behavior that physicians and epidemiologists believe are important.

[illegible] fail to address the passage of time in the manner I described above.

[illegible] depend upon relative mortality estimates and smoking prevalence [illegible] are derived from samples that are not representative of the populations [illegible] trusts.

[illegible] leave critical assumptions implicit and not evaluated. For example, [illegible] their reports do Dement and Roberts scientifically evaluate their [illegible] that the relative risk for mortality is the same as the relative risk for [illegible].

[illegible] calculations consider the possibility that other health-related behaviors ([illegible], illicit drug behavior) may be affected by the existence of smoking.



CONCLUSION

The plaintiffs' [illegible] analyses do not even attempt to generate any estimates of the trusts' expenditures that were caused by the defendants' alleged misconduct on the behavior of the [illegible] (two-component behavioral model).

Neither Dr. Dement nor Mr. Roberts even attempt to generate any estimates of the trusts' expenditures that were caused by the defendants' alleged misconduct generally (one-component behavioral model). Dr. Harris properly recognizes the need to consider separately the effect of defendants' alleged misconduct on smoking behavior and the effect of that change in smoking behavior on the trusts' health care costs. He attempts to estimate the effect of defendants' alleged misconduct generally on the trusts' expenditures, but his estimates are neither reliable nor statistically valid.

Finally, all three plaintiffs' experts' analyses attempt to estimate the medical expenditures of the individuals in the trusts due to the existence of smoking. Even these calculations, however, incorporate essentially none of the characteristics that are critical to obtaining scientific, reliable, and statistically valid estimates.

I am continuing to address the issues surrounding the plaintiffs' expert reports, especially Dr. Harris' October [illegible] report and his November 6 report, and therefore may supplement my report in [illegible].


N< vember 
Advisor: Professor Anthony G. Oettinger


Harvard University
DONALD B. RUBIN


Other Professional Positions:


1983-Present


1982-1984


1981-Present


1980-1982


1971-1974


1971-1975


1970-1971


Research Associate, NORC, Chicago, Illinois.


Professor, Department of Statistics and Department of Education, The University of Chicago, Illinois.


Visiting Professor (February, April), Department of Mathematics, University of Texas at Austin.


[illegible] Professor, Mathematics Research Center, University of Wisconsin at Madison.


President, Datametrics Research, Inc., Waban, Massachusetts.


[illegible] (December, January), Division of Statistics and Applied Mathematics, Office of Radiation Programs, U.S. Environmental Protection Agency, Washington, D.C. (Senior Executive Service 4).


Visiting Fellow, Center for Applied Statistics, University of Lancaster, [illegible]


[illegible] Research Advisor, Educational Testing Service, Princeton, New Jersey.


Visiting Fellow, Center for Applied Statistics, University of Lancaster, [illegible]


[illegible] Lecturer (Spring Term), Department of Statistics, Princeton University, Princeton, New Jersey.


[illegible] Lecturer (Spring Term), Department of Statistics, Harvard University, Cambridge, Massachusetts.


Visiting Associate Professor (Winter and Spring Quarters), Department of Applied Statistics, University of Minnesota, St. Paul, Minnesota.

Chairman, Statistics Group, Educational Testing Service, Princeton, New Jersey.

Visiting Scholar (Winter Quarter), Department of Statistics, University of California at Berkeley.


Lecturer on Statistics, Department of Statistics, Harvard University, Cambridge, Massachusetts.
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1969-1970 





Research Assistant for Professor William G. Cochran, Department of Statistics, 
Harvard University, Cambridge, Massachusetts. 


Teaching Fellow in Statistics for Professor Frederick Mosteller, Research
Assistant for Professor Frederick Mosteller, Department of Statistics, Harvard
University, Cambridge, Massachusetts.


Statistical consultant on various projects for governmental agencies, research 
organizations, and university faculty. Details available upon request. 

Research Assistant for Professor Lawrence Stolurow, Computer Aided Instruction
Laboratory, Harvard University, Cambridge, Massachusetts.



Fellow in Computer Science for Mr. George Mealey, Department of
Applied Mathematics, Harvard University, Cambridge, Massachusetts.

Assistant for Professor Lawrence A. Pervin (Summer), Department of
Psychology, Princeton University, Princeton, New Jersey.


Research Assistant for Professor John W. Tukey (Summer), Department of
Mathematics, Princeton University, Princeton, New Jersey.



Fellowship (Summer) - Research with Professor Lawrence A. Pervin,
Department of Psychology, Princeton University, Princeton, New Jersey.


Fellowship (Summer) - Research with Professor Lawrence A. Pervin,
Department of Psychology, Princeton University, Princeton, New Jersey.


Technician (Summer), Armour Research Foundation, Illinois Institute
of Technology, Chicago, Illinois.


Technician (Summer), Armour Research Foundation, Illinois Institute
of Technology, Chicago, Illinois.


Professional Activities


1975-i 


1975- 


979 


9S0 


Associate Editor, Theory and Methods, The Journal of the American Statistical
Association.

Associate Editor, Book Reviews, The Journal of the American Statistical
Association.

Organizational Representative for the Institute of Mathematical Statistics at
Educational Testing Service.
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Y" &£. Beaton and J. Barone). 

1976 "Multivariate Matching Methods that are Equal Percent Bias Reducing, I: Some
Examples." Biometrics, 32, 1, pp. 109-120. Printer's correction note p. 955.

1976 "Multivariate Matching Methods that are Equal Percent Bias Reducing, II: Maximums
on Bias Reduction for Fixed Sample Sizes." Biometrics, 32, 1, pp. 121-132. Printer's
correction note p. 955.

1976 "Comparing Regressions When Some Predictor Values are Missing." Technometrics,
18, 2, pp. 201-205.
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expense*. K, for 1individuals with given 
background demographic characteristics, X, other
health-related factors, D, and smoking behavior, S, is
the same as in the actual world. It implies that the
relative risks of medical expenses for various smoking
behaviors, S, for individuals with given X and D are
the same in the counterfactual and actual worlds.
This assumption appears to be always maintained in
practice, and so we consider later what needs to
be done to make it plausible.

Assumption 2 states that the conditional
distribution of other health-related factors, D, in the
counterfactual world for individuals with given
demographic characteristics, X, is the same as in the
actual world. Assumption 2 is typically made, but
we allow modifications of it. In general, an
alternative to the second assumption defines the
conditional distribution (i.e., prevalence) of D given X



; In the counitrfoauj 
sal, an alternative# 
sc cood l licaai disci 
; behavior (S) in they 
Ahie/backgrouad dp 
Sited fa^a 

Carving sfes^ PoUnl 
u table J rad tan r an# 


(gjDolL 
teat an 


there ctan no 
tad so defines IL 
third assumption 
Lc, prevalence) of 
actual world given 
lies C70 and other 
dlvfduals in the 


T7r»ag Smoking 
X Prevaleaeaa 


dat Assumptions 
Holly in Section 3^ 
wodd relative tel 


; attributable fractle 
irtd, Z-0 iadlaiel 
PBofctnz.' and 2-2 


attributable 


ruphic characrcristio. X 
tors. D: 

SAI^CXD). 


& 2. both Mated 
ifeas expenditures in 
iifiirnal world can 
| concept of the 
| Z-l indicate the 
unteriactuaj world 
I-etc, generioHy 
nds., . ; -Then the 
Jthe actual world 
In matrix fimctlen 
and other-health* 


p-rlisMy, SAf^pCi DJ.Tfe •a twoway anaiy. jLe^ 
Sif«fc-ec*k>asnc esehrgJue.ofPC D):.ochruw 
sponds to a component or typo of health care east ■ 
. Medkaldcneta for treating lung cancan, private 


for those individuals with specific values of (X, D) in
the actual world, POT(X, D). Taking each entry of
SAF*(X, D) times the corresponding entry of
POT(X, D) and summing over all values of (X, D) in
the population gives the two-way matrix of excess costs
for components (or types) of health care costs and years
in the period; the expression is given by Equation (5).
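
The entry-by-entry multiplication and summation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two (X, D) cells, the two cost components, the two years, and every fraction and dollar amount below are hypothetical, and `excess_costs` is a name introduced here rather than one used in the document.

```python
def excess_costs(saf, pot):
    """Sum SAF * POT entry by entry over all (X, D) cells.

    saf and pot map each (X, D) cell to a two-way array:
    rows are cost components, columns are years.
    """
    cells = list(pot)
    n_rows = len(pot[cells[0]])
    n_cols = len(pot[cells[0]][0])
    excess = [[0.0] * n_cols for _ in range(n_rows)]
    for cell in cells:
        for i in range(n_rows):
            for j in range(n_cols):
                excess[i][j] += saf[cell][i][j] * pot[cell][i][j]
    return excess


# Hypothetical inputs: two (X, D) cells, two components, two years.
saf = {
    "cell_a": [[0.10, 0.12], [0.05, 0.05]],
    "cell_b": [[0.20, 0.25], [0.00, 0.10]],
}
pot = {
    "cell_a": [[1000.0, 1100.0], [400.0, 500.0]],
    "cell_b": [[2000.0, 2000.0], [300.0, 200.0]],
}

excess = excess_costs(saf, pot)
for row in excess:
    print([round(value, 2) for value in row])
```

Each entry of the result is the excess cost for one component in one year, so the output keeps the same component-by-year layout as the pots themselves.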

Under the first two assumptions, SAF*(X, D)
can be expressed as a function of:

• the prevalences of various smoking behaviors in
the actual world and in the counterfactual Z world,
as functions of demographic characteristics, X,
and other health-related factors, D,


• the relative risks in the actual world of the various
smoking behaviors S relative to no exposure to
smoking, also as functions of (X, D).

This expression for SAF*(X, D) emphasizes that under
Assumptions 1 and 2, all that we need to specify to
define the counterfactual world is the set of prevalences
of all relevant smoking behaviors as functions of (X,
D) in that counterfactual world. Also, this expression,
Equation (10) or the equivalent, Equation (11), shows
that the correct formulation is relatively subtle (e.g.,
involving the difference of prevalences in the actual
and counterfactual worlds, in its numerator). An
expression for SAF*(X, D) mathematically equivalent
to (11), Equation (13) (or Equation (14), which is
parallel to (10)), involves the actual world relative
risks and prevalences of smoking behaviors and the
relative frequencies of those smoking behaviors in the
counterfactual world, all as functions of (X, D).

1.4 Tasks

Consequently, under Assumptions 1, 2, and 3
(i.e., Z = 0 world with no smoking), the task is to
estimate the relative risks of various smoking behaviors
and their prevalences in the actual world, as well as the
pots of expenditures in the actual world, all as
functions of X and D. Under Assumptions 1 and 2,
dropping Assumption 3, the additional task is:

• Posit the relative prevalence in the counterfactual
world (versus the actual world) of smoking
behaviors given (X, D).


insurance for nursing homes), and each column corresponds
to a year in the time period under consideration.
Corresponding to this two-way array is a two-way array
of actual costs (pots of dollars)

Under Assumption 1, but dropping Assumption 2 in
addition to Assumption 3, the additional task is to
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« bam (he relative prevalence of other health-related 
factors, D, in the counterfactual world (relative to
the actual world) for individuals with specific
values of background characteristics, X.

This additional specification implies a "behavior
modified smoking attributable fraction", which is a
function of SAF*(X, D) and the posited relative
prevalences of D in the counterfactual world Z. This
behavior modified smoking attributable fraction is then
applied to the pots of expenditures in Equation (16).

Assumption 1 is maintained throughout. Ways
to make it more reasonable are discussed in Section 3
and especially Section 6.


Document 



I The mm&m* of tWi -aoewnei*-*. 
statistical 'definitfon* tad jtatlfiea|qiS 
coixriurion* ^ ti i overview. - As pM 
throughout y^^doaimcat. the toucan 
estimate dif^H^is and aaotma in 
an wslcaily ignored. ' I; b tfixtlveijpBBi 
these out be gaga mi ted through proper data 
efiojts in the ftweariate population. fUM 
minjmiu the fygo ifty of dg fo jrt hl* c %j|g 
problems of jBt&fng (K llstribifiil 
nanitaoksrSpaBB^Miiotg sroqjSi? 

\ —r*“* by propensity scgrp»jtfy**> mi 
andjiequlreMntha for estimatloivflne 
longinirUiul^ ^dp er Incorporating t#pigg 

patterns of smimng''' behavior ta dtu^t 
outcomes in ti l§| eta also c o mp lex.' Bu ksb 
an purely dual! ftoin^he' banotptuaPai 
asstynptloas n ffedj^j to define‘enuruafoere 
witlgjut the m &g ti i misconduct " so,tlu 


2, Notation 

2.3 Basic Quantities 

We begin by defining two simple quantities in
the actual world and an indicator for which world is
being considered.

N = number of individuals in the population
during the time period under consideration
(i.e., alive at one time or another during these
years; this allows for births and deaths. N
is the number who were alive at some time; by
definition, the value of a measurement for
someone who is not alive is some arbitrary
value such as "*"; the point is that all such
individuals are in principle included).

X = background and demographic factors
generally varying from individual to
individual but unaffected by the existence or
not of alleged misconduct (e.g., birthdate,
sex, race). In general, X is a matrix with
many factors (each factor is represented by a
row) and years under consideration (each year
is represented by a column); for example,
education (a row), if considered to be
unaffected by alleged misconduct, would
generally vary in time (across columns).




alleged misconduct indicator'' 


expjodJfura e 

interest and J 
cougtedhetum 
of exc es s cm 
coujuerfoctaal 

this 1 tetHudi 

Implicitly «a 
pmaoudy, wt 


' actual world * ~ 

. u.>ionaeDri.sc&s« 


■z-o 


l :■< V- .vwuSi'-i 1*3 

4 *y f • *w 


akularrri 

eats notation for th 
ir dJsafbudctu In i 
i. Section 3 conk 
tire actual world 
wtiteutlhe esdstea 
'current" practice 



comuofocsaal v*St*without the existence of smnlrfitp ' 2 - 3 =» - i cobnseiaeattl^www ^ 

this 1 estiaatin(tei^^ol^ent :, pl•3^ea^ * tafav * k ptaca ; ' -without some other»**pec& : ‘Of 

Implicitly uadfo the three'assumptions' presented - alleged misconduct. -* s«~ 

pmaourfy, whfi^^dlhcrB'nxle'taaUetL‘ Section'd'"'';- 1 '.•*' 

bx tqedi thi i rr)ixMlrt|in tf!lflnii ? inin>*fn(nlrliTftirhe'iirtr '• ‘ ' °'tTC. .(QJSJ -AB 

la Ue eaunrer^to^amridHi^ rtfl»^Aj«Buptldo”i r ~' i ** *Jjwj 
3). iund Seetio^P’flRher allows the mmierfeetal We now define outcomes tta^are-fantofoos-wf'Z; 
world to tare same' Vv odifl rift lons In' other "health-^ : generally vary froitTindlvidual fo .l»lWdoalJ^in >tee^w«^ 
related foetors <Le^'fc'' 1 e&sd 5, rele*ef Aimmptfoa 2)T nmltkompooerU (r^) erf >«ry WtinfoT^ ial nSM«5) 1 *? * 
SecAon < diecusecs the crucial A ss umption 1 and how 4, '° Jmsunee nred g fuusru-yntmmackjma 

to nudes It plausible. SCQ- smoking behavior In w uri dZ; amn i mnr . leopfore^vtC 

j .« ■* ,ft«Kir,uf*a.^ *i "-sJrusuraA ctpo shl ^ . pr and 


Z = 0: a counterfactual world without
the existence of smoking,

Z = 2: a second counterfactual world
without some aspects of alleged
misconduct,

Z = 3: a third counterfactual world
without some other aspects of
alleged misconduct.

We now define outcomes that are functions of Z; these
generally vary from individual to individual in the
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festally inetude more characteristics sad non 
interaction terms). As with standard practice, we
maintain Assumption 1 throughout and so rely on (X,
D, S) being inclusive enough to make it plausible.

3.3 Assumption 2: Other Health-Related Behaviors
Distribution


Assumption 2: \( f_0(D \mid X) = f_1(D \mid X) \).

The conditional distribution of other health-
related factors, D, in the counterfactual world
for individuals with background
characteristics X is the same as this
distribution in the actual world.



inapdea 2 Is really t 
hats the first . I 
llereamples to.MUI l night include bad. 
[lecher drugs used, stress, «U of which 

Id be worn, in the aGucqgieadlnf to worse health 
*tne* In a coua ssrtS^M^orid tn which tobacco 
jwea reduced. Alliliiijpuuspie is Medicaid 
IbUity hsci4 whldfa could change in a 
igrfietual world, fe rg^m^ le. UT fhtty diets led to 
f 'ban d jenase . or If yAhce d smoking led to longer 
^wlth taonmft, aren^Wid age. 
j As mw^X vxplgcs are Included .In D, 
rnotm yf l lfomrr iM^ That 

if'example. If ^duded in©, then the 

that 'diet* might bgpSHIgnt in a cnunterthctual 
1Z than in the scaww od d b of no.consequence 
sumption 2, whcre ^ afwet is a component of D. 

a critic out clajttrrH could change la a 
lerihctuai world bedtosr'die: could change. .Huts, _ 
iks Assumption 2 m ost ptot^ ic.,nny,yirinbia that 
t be affected by the Piliin ritual world should be ., 
ed. But leavtoi o utaochJi 'verlabUxTBates street 
aam n pti oa 1, - whlrj^ l wl w^ swu jilatwfli i lc as^ 

and more *"*"• fo-'PT** ■ 

b.tosukethtaaceediralnptiponwstjwiBd.arv _ 
doral* betor-^rhat-couldqcb^' afftsaod^by^. a 
erfactaal world without; dhe;*«lJ«gs4ijaiBWoiuet 
d be kfi •utofrConsWecttio.th^hHSW tp^hake.l_ ; 

m *tu4 k aW wti fa ffdKrtitM •>*»■ Jst ^U fA uj L Rttii? bm lift ^ 

dw eoastdartetarfag AawTgp tlon 4 ,ifie .demand. 
dug Inclusive la the definition of S b strung., 
lug Assumpdee 2 h e a r t i d e re d in Section 5. '' ~ " 

(Q Jt, It)*."' 

Assumption 3: A World Without Smoking


waypsyche logical one. 




\( f_0(S \mid X, D) = 1 \) if S reflects no smoking (S = 0), and
0 otherwise.

In the counterfactual world, all smoking is
gone for everyone; this is the Z = 0 world.

We relax Assumption 3 in Section 4.

The first and third assumptions together imply
that the conditional distribution of health expenditures
in the counterfactual Z = 0 world without smoking for
individuals in the population with given X and D(0), is
the same as the conditional distribution of health
expenditures in the actual world for individuals in the
population with the same values of X and D(1) who
were never exposed to any cigarette smoking.

3.5 Using the "Smoking Attributable Fraction"

A typical step in current modeling is to rewrite
the difference, H(1) - H(0), using an expression
involving the "smoking attributable fraction"
comparing the actual world to the Z = 0 world as a
function of X and D(1) in the population. Using
componentwise matrix division and multiplication and
standard notation for conditional expectations, we
define


\[ \mathrm{SAF}^{*}(X, D) = 1 - \frac{E[H(1) \mid X, D(1), S = 0]}{E[H(1) \mid X, D(1)]}. \]


Then we have under Assumptions 1 - 3,

\[ H(1) - H(0) = \iint \mathrm{SAF}^{*}(X, D)\, \mathrm{POT}(X, D)\, dD\, dX, \tag{5} \]

where

\[ \mathrm{POT}(X, D) = N\, E[H(1) \mid X, D]\, f_1(D \mid X)\, f(X) \tag{6} \]

= the H(1) dollars (in the component by
time matrix H(1)) in the "pot" defined
by given values of X and D,

and \( \iint dD\, dX \) = the sum over all values of (X, D) in
the actual world.

Equation (5) shows why SAF*(X, D) plays such
a major role in current discussions comparing costs in
the actual world to costs in a counterfactual world
without smoking. Equation (5) says to take the dollars
in each pot and attribute the fraction SAF*(X, D) of
them to the existence

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at smoking for those Individuals with those values of 
(X, D), and then add these dollars up over all values of

(X, D) in the population.

The quantity SAF*(X, D) can be written in terms
of actual world prevalences and relative risks of
smoking behaviors. Section 4 does this in a more
general context.

4. Allowing Counterfactual Worlds With Some
Smoking But Still Maintaining Assumptions 1 and 2

4.1 Uriii Jgggaa^ n Count*rfactual Worlds That 
Add tu tbAJCIfan of the Alleged Misconduct by 
Alloying SooMlSmdhJnc , 


fixed X, D) as a set of prevalences, p^1(S; X, D) and
p^Z(S; X, D), respectively, indexed by S and summing to
one across S at each value of (X, D). Corresponding to
the sets p^1 and p^Z of prevalences of smoking
behaviors, we have the set of relative risks for each
type of smoking behavior in the actual world, that is,
relative to no smoking, also indexed by S:


\[ r^{*}(S; X, D) = \frac{E[H(1) \mid X, D, S]}{E[H(1) \mid X, D, S = 0]}. \tag{9} \]


; Let ui 
and B. bat 
tUey son 


such! tadl vide 


m Sot now to accept t 
mumpdon 2, which 
&pf . in comurrftt 
m SAf^CXD) to be ( 
expenditure*. for tad 
Eg in the countsr ftctu 
vc actual world:.. 

tyixofiisirfcaixo 


Pponx 1 
Beetled to 

I %)rids. 
lainps the ’ 
~£m with 

to 


r*(S; X, D) in Equation (9) is the relative risk for
smoking behavior, S, for those with specified values of
X and D in the population.

We now can write SAF*(X, D) in terms
of p^1(S; X, D), p^Z(S; X, D), and r*(S; X, D):


S*f*CX0) 


rO-t.-w pr^ i — 

fl g“TT 

i ’f Is J?^p®brvrirtl to*iii®ihat 


sartxo- 


Z VtKXO)-*>*(* 

Z a’(StXO)*r‘(S:'xtD ’ 


l/<s: x 0 ) • f*cr, x 0)1 • 1^(1: x oj ■ t J 


i 'r\sm 
placg of SA 
Asnjmpdonj^ 
H(Oj. That 6, 
rethff chan SA1 


ifbnvirri xg rW e rk hAt uj 
In EqfeipaNP) ] 
yt, H(i) - H(Z> in pi 
O) to carve 4 
CD) and add over all C 

^tXDjPOTC^Dl'J 


, SAf* in 


Z ft*: X 0) •f'(S:X 0)1 ♦ t • Z »*!»; X O) 
s*e *»e 

where the summations in Equation (11) are over the
positive smoking behaviors. Special forms of (10) and
(11) yield standard expressions for relative risk.
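
The relationship between the two forms can be checked numerically. The sketch below assumes that Equation (10) is the ratio of the sum over all behaviors of [p^1 - p^Z] r* to the sum of p^1 r*, and that Equation (11) is the same quantity written with r* - 1 and sums over the positive smoking behaviors only; that reading follows from the surrounding definitions, since the printed equations are not legible in this copy. The three smoking behaviors, their prevalences, and their relative risks are hypothetical, and the function names are introduced here for illustration.

```python
def saf_all_behaviors(p_actual, p_counter, rel_risk):
    """Assumed form of Equation (10): sums run over every behavior S,
    with index 0 standing for no smoking (relative risk 1)."""
    numerator = sum((p1 - pz) * r
                    for p1, pz, r in zip(p_actual, p_counter, rel_risk))
    denominator = sum(p1 * r for p1, r in zip(p_actual, rel_risk))
    return numerator / denominator


def saf_positive_behaviors(p_actual, p_counter, rel_risk):
    """Assumed form of Equation (11): sums run over S > 0 only."""
    triples = list(zip(p_actual, p_counter, rel_risk))[1:]
    numerator = sum((p1 - pz) * (r - 1.0) for p1, pz, r in triples)
    denominator = 1.0 + sum(p1 * (r - 1.0) for p1, _, r in triples)
    return numerator / denominator


# Hypothetical cell of (X, D): none, light, and heavy smoking.
p_actual = [0.60, 0.30, 0.10]    # actual world prevalences, sum to 1
p_counter = [0.80, 0.15, 0.05]   # counterfactual world Z prevalences
rel_risk = [1.0, 1.5, 2.0]       # actual world relative risks

saf_10 = saf_all_behaviors(p_actual, p_counter, rel_risk)
saf_11 = saf_positive_behaviors(p_actual, p_counter, rel_risk)

# Z = 0 world: nobody smokes, so only the no-smoking prevalence is nonzero.
saf_no_smoking = saf_positive_behaviors(p_actual, [1.0, 0.0, 0.0], rel_risk)

print(round(saf_10, 6), round(saf_11, 6), round(saf_no_smoking, 6))
```

The two functions agree because the prevalences in each world sum to one, so subtracting 1 from every relative risk leaves the numerator unchanged and turns the denominator into 1 plus a sum over the positive behaviors.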

O How to Specify Prevalences in Coonterfictual - - 
"WortdZ ' ■ i ' < ~ •••• 


fi(51XD);ij 
and|2 assert tl 
(LtJ f* and 


bnplcment Equad 
2 Is a pod tod d 
a the couBics&et 
in for this is that < 
' distributions dm 


Whit is oecdes? W implement E^iad qq (t) J inder The critical quantities in Equation (11) yr the ; 

Assumptions iPW§ 2 Is a pod tod dlpfjiSB wa for prevalences of different types of smoking behaviors In 

behzytBfgin the cmmtcrfbc nul wadd Z, the eounrerfacttol world ns a fonctioo of background^ v 
f*tsix DV. to^spue n for this la that /lllijlfcng 1 efcancterisOes X and other health-related foctnrt O •- t ■ - 
aad|2 assert znttrxtmp distributions Idad^ttAfe (which id this section are assumed to be unaffected by 

(Lajftand fSHlre the cum in the Joume^iemd the alleged mls eo wfaci — Assumption 2). • - • - v ' r ‘- *''’v 

and hemal wori*i^\T - * la some cues It may be caster to thlnk abounh* ~ 

j ''./ ’ r "\"T way the alleged misconduct would modify actual world : 

4.2? SAT* A f a > ftmctiinPaif *Frgva>cace ind prevgimfles.es a Amciloft of'X ««d D. nlherUan tUnk 

|u4dvc Klik M There are Olfleretti Types of directly about ** the. * eeunttrihetud - prcvalem** i»> 

SmOUag EchodljMUU.Msisuateliig Assumpriou 1 ..themselves. To dd this.-define k*(S; X"D)es ttariilto.ua* snta, 
an3l iZXr" -- " c prernience of w«eln* behjfd« ff<i«^^orei«*rfhcwil« hec artesrt 

4 ^ ,ga ca ‘ i - * *> to- p t i v a ie n c c . In-UN'CetMl- wo*W 

_iJjZyZi; - nf,TX?4r..f' 1 y < - • • ~ r - -~ ^ .^ranr.z^. gricultt-. 

ag aeribi^bte^fi«^^>wy«.to_ arenwactud ,, v 

Z. under Asma^aonsJ enqX ....... . .fi2i 4 

Tha 7~T 3r fc*(S;XD)- r •- 1 r -• ‘ .'sr..V*rr«A :.-X 

Si* 5 *p'ffixw • 4 


rf' 4 *- 
Tauian ii 




«»s* it wt> iU 


p'(S;XO) 


^^2) r «j.v*i'rWtA • : «-.C 




'C 


' • , . , ' .. *' yiaxn £A »rrr.*~***™ r ~ * : 
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where. presumably, X, D) jf i for S > 0. and 
k*(0; X, D) ≥ 1, thereby indicating less smoking in
the counterfactual world than in the actual world.

i 

| Then Equation (11) becomes; 


g*C>. X) - 


feCDJX) 




<W) 



X O) • lr*(a; X 6) • IJ • (J • *?(3: X 0)1 


.. 03) 




z 

s*e 


i'ftXO) 




CIO. 


ic of a partial 
(hat for the 
D; S « max) ■ l, 
ao outer what 
behaviors. 

In 
isca 

bya 
certain 

a certain 
also on Income 
Involving Jc* 
f pq ct& Cttloo of 

justification of 




S ulfentlaa «f 
noting and 
J eh avion But Still 

p&l&ssaf AsfMnpti 
acaiib-tuimd ■ 

.-.-v:. j: wn 




for ires: K DJ 
5 “ max 
t the heaviest _ 
or ethers 
maadmom 
world 2 
k*. for frmalrs 
bodes* to one 
could be 
Qr k* could- 
levels. The 

amenable to . 

the 


be the prevalence of health-related factor D (for people
with background characteristics X in the population) in
counterfactual world Z relative to the prevalence of
that health-related factor in the actual world; a
restriction is that, at each X,

\[ \int g^{*}(D; X)\, f_1(D \mid X)\, dD = 1, \]

because the prevalences of health-related factors must
integrate to 1 (if discrete behaviors are being
considered, the integral is replaced by a summation).

5.2 Expenditures Obtained Using a "Behavior
Modified" Attributable Fraction


Under Assumption 1, the excess health expenses
in the actual world relative to the counterfactual world
can be expressed as

\[ H(1) - H(Z) = \iint \left[ 1 - \{ 1 - \mathrm{SAF}^{*}(X, D) \}\, g^{*}(D; X) \right] \mathrm{POT}(X, D)\, dD\, dX. \tag{16} \]

Thus, the expression in square brackets, which equals
SAF*(X, D) when g*(D; X) = 1, can be thought of as a
"behavior modified smoking attributable fraction".

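A small numerical sketch may help fix ideas about the behavior modified fraction. It assumes the bracketed expression has the form 1 - [1 - SAF*] g*, which is one form consistent with the stated property that it reduces to SAF*(X, D) when g*(D; X) = 1; the equation as printed is not legible in this copy, so that form is an assumption of the sketch. All numbers are hypothetical, and the function names are introduced here for illustration.

```python
def behavior_modified_saf(saf, g):
    """Assumed bracketed term: 1 - (1 - SAF) * g for one (X, D) cell."""
    return 1.0 - (1.0 - saf) * g


def integrates_to_one(g_values, f1_values, tol=1e-9):
    """Check the restriction that g times the actual world prevalence of D
    sums to 1 at a given X (discrete case, so the integral is a sum)."""
    total = sum(g * f for g, f in zip(g_values, f1_values))
    return abs(total - 1.0) < tol


saf = 0.10                 # hypothetical SAF for one (X, D) cell
unchanged = behavior_modified_saf(saf, 1.0)    # D equally common in Z
more_common = behavior_modified_saf(saf, 1.1)  # D more common in Z
less_common = behavior_modified_saf(saf, 0.9)  # D less common in Z

# Two hypothetical values of D at one X, each with actual prevalence 0.5.
restriction_ok = integrates_to_one([1.1, 0.9], [0.5, 0.5])

print(round(unchanged, 4), round(more_common, 4), round(less_common, 4))
print(restriction_ok)
```

Under this assumed form, a cell whose D value becomes more prevalent in the counterfactual world contributes a smaller fraction of its pot to excess costs, and a cell whose D value becomes less prevalent contributes a larger one.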
Hence, all that are needed in Equation (16) to allow
health-related behavior modification in counterfactual
world Z (still maintaining Assumption 1) are:


■o»i — •-* 




' *W> ;t*?- • • - 

The pots of expenditures in the actual world,
POT(X, D), as specified by Equation (6), and


> , 


Worlds That 
Other Health' 

In Assumption 1 .. r *r - - ' ’ . ' 

Svi-rit-cV-* The smoking attributable. fraction. SAP(X, D) 

lowing Changes..; te in counterfaouai wertdZ, as expressed by the 
— — y* r f>jiraqu|vslaat-Equations; (10), (l oc (Ifl, 

W rMiiT’-ri.t?' rrtbwhteh are determined by:,,... . : te::-*:*** ■ 

14* **«ivma S- 3J&hr- v r* 


14* **«ivma 3*&itF' V " - .* ". -“!! * ‘ v.ta: 

t> <2i)The aew*l ; , world relative risks of .tUfferent 


Section 4 we showed how to>fomult»lhe- 
i unikr Assumptions’! tnd'2 T 'htft reia9tiAg - . __,— . 

1 3, which dalracd theep u n t erfocnaiarerido a.- u V itsowpklng bchayiotX"T(S; X. D), -j c r 

fcsmokingtapoAtrscfottykind; t h ereb y aDowingru Ttiasitwr- * «• £ o.-sisti-ivm r--n vv/u 

Blffirniil T-rrriit Ytimf wiu« ,*xuj»K2b)The actual njoorid «ctevakn«S r j< 

Li^isasOcw!" s Himsityn tieirbhribiv newt) fcwldj w l dn gbc he v i o f l, p.(S; Xi-TOret itV-r-. '•.'..’.■uV-.r ;t. 

J We now proceed to fewnUptfMhif'ifiajaewQtfcv-s: or,j:ii»a«*errr-- uw-lnmav;; *ri 1 *» sv/— -i xr-.j? riv.'-*' .- 

furtier, by relasdcg^Assumption 2 w allow chaagesiarrf v *lsv (1c) The eouateffoctua!^wojW gdatirejwevtfea^ : : ‘ - - 

other health-related &c«n lA"a>uunUtf«misl worldsuihof- thw-smoklng heh»im 4f(K.x. nfc.,-; r . 
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In. 


(3) The counterfactual world relative prevalences of
other health-related behaviors, g*(D; X), given by


6. Further Discussion of Assumption 1

6.1 General Comments


i 



Assumption 1 is designed to be like a "law of
medical gravity", to hold more or less
exactly throughout the entire population and no matter
what happens in the counterfactual world. It provides
the relative risk (in dollars of medical costs for types of

medteal 
function 
hca^li* 
actual 
modal 
junjtablfc 
doriocper 
the 

£X.tf. S). 



our smoking behiMon, S, as a 
cbaractcrUtiet, ^Nap^y other 

To the exteMlil rla the 
i on be used m predict ttrito 
this assumption fc o a g e l more 
If la (he icual * 6ciiJQ£jp . S) 
t K, then them is wKsoattjSeofet 
; vsrtniofj (a K. un g j S^gp ie by 
due'to unpredic 

variation rather than alient emitted foemiff^F&ct, if 
the Tclntlooshi ^teyt ea H and (X P. y^g^ M In 
observable subjj^p3a||^tu » the actual is 

strong evideac&gifotast the *ipg|teil pav||^MI^n of 
the (oodel a I T^g~" «°ly «| *^p. S). 

• More flliaffeillr. l eTusl suopf 
estimate thedlstributl^KlXl 
Aftr^^nt pop^a wfaed by some footer i 
at ljHCDlX OJ for G - L 2. 

indfontes ModiG^Nri^iUty). If we 
answers (Le., tsquwn relative risks) 
then dourly the^po^l ignoring 0 1 
parity — it, wtsdty depends on G, 



itionlitg only on 
|w. and there may be p 
i to a eon 
I order to try m i 
sodh a factor O! 
iadkatet region 
the model as ta X variable. If 
[ efigfbilhy.'then tndude It In (he ■ 
If G Indicates a type of •• 
i Include It law S as a medially 
■noUap^f Inctodise r k 4n -the - 
I generally nans not just 4ax 


evidence is 
not ^jve a univer 
believe It will j 

I Tbct t 

parity law, Is i 
1 -For estaxnfi 
satf, then Indu 
G indicates 
model as a 
splint 


magical expendli 

an extra term, but also to include all its interactions
with (X, D, S) variables already in the model.

Of course, even when we include extra factors,
we still do not know that we have a law of medical
gravity for the expenditure model, which generates
the relative risks of smoking behavior, but at least once
we have included all possible variables we do not have
direct evidence against it. Thus, the more X and D
characteristics we include, the more comfortable we are
with the "medical law of gravity" interpretation. But
as we add more and more D factors, the burden of
specifying the factor g*(D; X) and its relationship with
the factor k*(S; X, D) becomes more difficult.

Another possible justification for Assumption 1,
even without nearly perfect prediction of H from (X, D,
S) in the actual world, is suggested by the typical
modeling of expenditure distributions, which we now
discuss.

6.2 A Decomposition of Assumption 1: A "Local
Medical Law of Gravity"

The typical modeling of the distribution of
expenditures given predictors proceeds in two parts:
first, the probability of a positive expenditure is
predicted using all the individuals (often using a logit
or probit specification); then, conditioning on the fact
that expenditures are positive, that is, looking only at
the individuals with positive expenditures, the amount
of the expenditures is predicted (often using a normal
linear model to predict log dollars). In some
situations, we may have very good predictors of the
existence of positive expenditures, but less good
predictors of the level of expenditures. Even though
the second part of the model does not yield precise
predictions, nevertheless we may be fairly satisfied
with the generalizability of the full model because the
second part is for a restricted subpopulation, i.e., those
with positive expenditures who may have much more
homogeneous values of predictor variables than do
those in the full population.

More specifically, suppose that in the actual
world we have an excellent predictor of the existence
of some positive expenditures in H, e.g., the component
of D that indicates Medicaid eligibility, D*. Having D*
available greatly increases the predictability of the
existence of many types of medical expenses in H, and
therefore contributes to the belief in a "medical law of
gravity" for this first part of the model.

Now consider the second part of the model.
Even if it is difficult to predict accurately the exact
expenditure amounts for those individuals with some
expenditures, we may still believe that Assumption 1
applied to these individuals represents a "local medical
expenditure law" because the assumption is focused on a
relatively homogeneous subpopulation, with some
medical expenditures and relatively similar (X, D, S)
characteristics. Locally, in this subpopulation, (X, D,
S) may be sufficient predictors of the actual expenditures
in the sense that lack of predictability may be due to
essentially random events.
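
The two-part structure described in this section can be illustrated with a deliberately small sketch. It assumes a single binary predictor (a hypothetical eligibility indicator), for which a logit fit with one indicator reduces to the observed proportion with positive expenditures in each group, and for which a normal linear model for log dollars reduces to the group mean of log dollars among those with positive expenditures. The records and the function name are hypothetical and are not drawn from any data in the document.

```python
import math
from statistics import fmean


def fit_two_part(records):
    """Fit a saturated two-part model by group.

    records: iterable of (group, dollars) pairs.
    Returns {group: (share_positive, mean_log_dollars)} where the mean of
    log dollars uses only individuals with positive expenditures.
    """
    by_group = {}
    for group, dollars in records:
        by_group.setdefault(group, []).append(dollars)

    fit = {}
    for group, values in by_group.items():
        positive = [v for v in values if v > 0]
        # Part 1: chance of any positive expenditure, using everyone.
        share_positive = len(positive) / len(values)
        # Part 2: level of expenditure on the log scale, positives only.
        mean_log = fmean(math.log(v) for v in positive) if positive else None
        fit[group] = (share_positive, mean_log)
    return fit


# Hypothetical records: (eligibility indicator, annual dollars).
records = [
    (1, 0.0), (1, 100.0), (1, 200.0), (1, 400.0),
    (0, 0.0), (0, 0.0), (0, 0.0), (0, 50.0),
]

fit = fit_two_part(records)
for group, (share_positive, mean_log) in sorted(fit.items()):
    # exp(mean of logs) is a geometric mean, not an arithmetic mean.
    print(group, share_positive, round(math.exp(mean_log), 2))
```

In this toy example the indicator separates the groups sharply on the first part (the existence of any expenditure), while the second part is estimated only within the narrower group that has some expenditures, which mirrors the "local" reading of Assumption 1 given above.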


V 
■;j t 
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fj The Demand oe D 

.."-at la dear from the discussion la Sections 6.1 
sod 6.2 1* that them U a need tor (X, D, S) to explain 
■tele of medical expenditures (a the 
I In oodcr ter Assumption 1 to be plausible. 
Um background characteristics X U realty 
problem, although there are issues cf 
of detailed Information In real data seta. 

; 5 to Include all medically relevant smoking 
? may be an eflbn, cspeclaJSy with res pect to 
real data, but It does not lead to conceptual 
c|n|pSfS4as. Za cortfmt. i^pcpsndla g D sot only 
fpmlll^auAds on data aoureestjrefe^reovetv creates 
complications pjne |<mq d fat them 
b the oocd. to ^Bedfr bow the 
8n«lD (e.g,D*-^e^ca^eUtlbility) could 
a eountarthctual ijBatfced thoot the alleged 




re BOW difeun 

y i' lQ | 

I to estimate exeare 


\[ f_1(S \mid X, D) - f_Z(S \mid X, D) = f_1(S \mid (x, d)) - f_Z(S \mid (x, d)). \]

Under Assumptions 1 and 2, factors (y, e) can be
ignored. That is, under Assumptions 1 and 2, factors
in (X, D) that may be relevant to actual world medical
expenditures (i.e., actual world relative risks of
smoking) but do not affect the difference of smoking
prevalences in the actual and counterfactual worlds
(e.g., affect neither actual world nor counterfactual
world prevalences) can be ignored when calculating
excess medical expenditures H(1) - H(Z).

6.5 What NEEDS to be Included in (X, D) Under
Assumption 1?

Most generally, we now only maintain
Assumption 1. The decomposition now involves only
X = (x, y) and is such that the difference in joint
smoking-behavioral prevalences between the actual and
counterfactual worlds is free of y:

\[ f_1(S \mid X, D)\, f_1(D \mid X) - f_Z(S \mid X, D)\, f_Z(D \mid X) = f_1(S \mid (x, D))\, f_1(D \mid x) - f_Z(S \mid (x, D))\, f_Z(D \mid x). \]

Then y and its distribution need not play any role in
the calculation of H(1) - H(Z). Thus background and
demographic characteristics that may be related to
health expenditures but do not affect the difference in
joint prevalences of smoking and other health-related
behaviors between the actual and counterfactual worlds
can be ignored. More simply, under Assumption 1,
background characteristics that affect neither smoking
behavior nor other health-related behaviors can be
ignored in the actual world medical expenditure (i.e.,
relative risk) model.

U follows that (X. replaced by <*, d) '-l ■ ~ * ■ 

effect on. the calculated excess- medical ic' -■*1r.-sfi-v -= .. - .■ 

H(t> - E(0> That la, under Assumptions 
3, Actors in (X Djahat may a®*t medical-■ — 

<ba- actual world relative risksj.bul do . ..’,. r . 

uaoMng prevalence can be Ignored In ate ' " 

TU» result Allows as a special cue of the 

SeettoadJ: w=us./lariodnt'sari.il rfitav.i£* < .* r>~'. ,r. - 





*’nunffepaSft tha 
led In (X, D) is 
ion 1 bold: 

m precisely, first 
XandD 
actual world sow! 
only through (x, 
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D) information 
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H(l) - H(Z). 

in (X, D) Under 


that NEED to 
re a* to make 
sot affect behavior 

ptions l. 2, 3, 
y) and J» “ (d, e), 
tret depends on 
these factors, Is 


What NEEDS to be Included In (X. D) Under 


s ■ ■ 


T ptlon* 1 and II r-v -j T V- 3 C * v ■ T ■ •' ’ t • • • - • . 

aa^^te^SoiSwpaitSnLi^^m ,*y*--**»H 

nowithe difference of smoking prevalences In the 

actual and CowdetAcsal worlds Is free of (y, e): . ; 
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IRON WORKERS LOCAL 17 LITIGATION 
SUPPLEMENTAL REPORT OF PROFESSOR DONALD B. RUBIN 

I am a professor of statistics at Harvard University. I served as Chairman of Harvard's
Department of Statistics for nine years, from 1985 to 1994. A copy of my most recently prepared
curriculum vitae was attached as Exhibit 1 to my November 6, 1998 report in this case.

In my November 6, 1998 report, I identified the features that the data collection and analyses
must have to yield reliable and statistically valid estimates of the excess health care costs incurred
by the plaintiff union trusts as a result of the defendants' alleged misconduct either generally or
because of the effect of that alleged misconduct on the trusts' behavior. I incorporate that discussion
by reference here.

Briefly summarizing, there would be two models: 1) a one-component or two-component model of the impact of the defendants' alleged misconduct on the subsequent smoking behavior of individuals who were or who would have been recipients of health care funded by the plaintiff union trusts; and 2) a medical expenditure model that estimates the effect on health care expenditures of those changes in smoking behavior estimated from the first model.

The essential features of these models are:
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3. Both models must focus on the population of interest. Here, that population consists 
of those individuals who were or would have been participants in the plaintiff union trusts' 
health care programs. 

4. Both models must consider distinct types of smoking behaviors that can lead to differing health-care expenditure outcomes.

5. For those aspects of either model that rely on assumptions rather than data, the assumptions must be explicated and justified. It is critical, moreover, that each relevant assumption be capable of being individually assessed and altered.


6. The possibility that smokers' other health-related behavior may be affected by the alleged misconduct must also be considered.


7. The statistical techniques employed in both models and in the studies upon which the models rely must be reliable and statistically valid. For instance, missing data must be handled in an appropriate manner; adjustments for background and other confounding variables must be made after taking into account differences in the distribution of those variables in the groups being compared; and sound statistical methods must be employed to assess the reliability and uncertainty in the resulting estimates.

8. The health care expenditure model must consider distinct types of health care expenditures potentially influenced by smoking behaviors.


9. Health care costs that would have been incurred in counterfactual worlds without the defendants' alleged misconduct must be accounted for when estimating damages. For instance, any increased health care costs due to smoking that occurred before any alleged misconduct cannot be attributable to that alleged misconduct because those costs also would be incurred in counterfactual worlds.

In my original report, I noted, inter alia, that neither Dr. Harris nor Dr. Dement obtain their estimates of relative risk (and, in the case of Dr. Harris, also his estimates of smoking prevalence) from a study of the participants in the plaintiff union trusts. I further reported that analyses of a nationally representative sample, the 1987 National Medical Expenditure Survey ("NMES"), suggested that smokers and nonsmokers within NMES, and within a union subset of NMES, had characteristics that were so substantially different that attempts to adjust for differences between smokers and nonsmokers using regression models are unreliable.
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In this report, I elaborate on this point and present additional analyses, also contained in 
electronic form on a diskette, which will be sent to counsel for the plaintiffs, demonstrating that the 
national population, as represented in NMES and the National Health Interview Survey ("NHIS"), differs substantially from the corresponding union subpopulations. Because those differences are so substantial, estimates obtained from a national population cannot reliably be used to estimate health care expenditures or smoking prevalence rates of participants in health insurance programs funded by the plaintiff union trusts.

THE REGRESSION MODELS RELIED UPON BY DR. HARRIS DO NOT RELIABLY ADJUST FOR DIFFERENCES BETWEEN SMOKERS AND NONSMOKERS EVEN WITH RESPECT TO THE BACKGROUND CHARACTERISTICS AND OTHER CONFOUNDING FACTORS INCLUDED IN THE REGRESSION MODELS

The studies relied upon by Dr. Harris frequently utilize regression models to attempt to adjust for differences in background characteristics and other confounding factors between smokers and nonsmokers. Based on my analyses of the national population, as represented in NMES and NHIS, and of the union subpopulation, as represented in NMES and as defined by Dr. Dement in NHIS, those regression models cannot reliably adjust even for those background characteristics and other confounding factors that were included in the regression models.

All of these regressions rely upon certain linear-additive assumptions, and the consequences of making these linearity assumptions should be addressed. Because these consequences are sensitive to the degree of overlap in the distributions of these background and other confounding variables between smokers and nonsmokers, it is appropriate to use propensity score methods to examine the overlap. Without such analyses, one cannot have any confidence in the regressions' ability to adjust reliably for differences between smokers and nonsmokers with respect to covariates (i.e., the background and other confounding variables).

A. Background on Adjusting for Covariates in Observational Studies 

The statistical literature warns that regression analysis cannot reliably adjust for differences in covariates when there are substantial differences in the distribution of these covariates in the two groups.

For example, William G. Cochran, who served on the Advisory Committee that wrote the 1964 Surgeon General's Report, wrote extensively on methods for the analysis of observational studies, as summarized in my chapter on his work on this topic in W.G. Cochran's Impact on Statistics (1984, John Wiley, New York). In 1957 Cochran wrote:

"When the x-variable shows real differences among groups -- the case in which adjustment seems most needed -- covariance adjustments [i.e., regression adjustments] involve a greater or less degree of extrapolation. To illustrate by an extreme case, suppose that we were adjusting for differences in parents' income in a comparison of private and public school children, and that the private-school incomes ranged from $10,000-$12,000, while the public-school incomes ranged from $4,000-$6,000. The covariance would adjust results so that they allegedly applied to a mean income of $8,000 in each group, although neither group has any observations in which incomes are at or even near this level." Cochran, William G. "Analysis of Covariance: Its Nature and Uses." Biometrics, Vol. 13, pp. 261-281.

And in 1965 Cochran wrote:

"If the original x-distributions diverge widely, none of the methods [e.g., regression adjustment] can be trusted to remove all, or nearly all, the bias. This discussion brings out the importance of finding comparison groups in which the initial differences among the distributions of the disturbing variables are small."

And in the same article:

"With several x-variables, the common practice is to compare the marginal distributions in the two groups for each x-variable separately. The above argument makes it clear, however, that if the form of the regression of y on the x's is unknown, identity of the whole multi-variate distribution is required for freedom from bias." Cochran, William G. "The planning of observational studies of human populations." The Journal of the Royal Statistical Society, A, Vol. 128, pp. 234-265.
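Cochran's school-income illustration can be checked with simple arithmetic. The following is only a sketch: the income ranges are those of the 1957 illustration, while the helper name `covers` and the use of each range's midpoint as the group mean are assumptions made for the example.

```python
def covers(lo, hi, value):
    """Return True when `value` lies inside the observed range [lo, hi]."""
    return lo <= value <= hi


# Parents' income ranges from the illustration, in dollars.
private_range = (10_000, 12_000)
public_range = (4_000, 6_000)

# Assumption for this sketch: each group's mean sits at the midpoint of its range.
private_mean = sum(private_range) / 2
public_mean = sum(public_range) / 2

# Covariance adjustment compares the groups at a common income level,
# taken here as the average of the two group means.
adjustment_point = (private_mean + public_mean) / 2

print(adjustment_point)                          # 8000.0
print(covers(*private_range, adjustment_point))  # False
print(covers(*public_range, adjustment_point))   # False
```

Neither group contains a single observation at the income level to which the adjusted comparison refers, so the adjustment rests entirely on extrapolating the fitted regression line.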

In particular, there are three basic distributional conditions that in general practice must simultaneously obtain for regression adjustment (whether by ordinary linear regression, linear logistic regression, or linear-log regression) to be trustworthy. If any of these conditions is not satisfied, the differences between the distributions of covariates in the two groups must be regarded as substantial, and regression adjustment, such as in many of the studies cited by Dr. Harris, is unreliable and cannot be trusted. These conditions are:

1. The difference in the means of the propensity scores in the two groups being compared must be small (e.g., the means must be less than half a standard deviation apart), unless the situation is benign in the sense that: (a) the distributions of the covariates in both groups are nearly symmetric, (b) the distributions of the covariates in both groups have nearly the same variances, and (c) the sample sizes are approximately the same.

2. The ratio of the variances of the propensity score in the two groups must be close to one (e.g., 1/2 or 2 are far too extreme).

3. The ratio of the variances of the residuals of the covariates after adjusting for the propensity score must be close to one (e.g., 1/2 or 2 are far too extreme).
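The three conditions can be expressed as simple diagnostics. The following is a minimal sketch, not the computation used for the tables in this report: the function names and the data are hypothetical, the pooling rule for the standard deviation is an assumption, the "benign" exception to the first condition is not modeled, and the cutoffs only flag values the text calls too extreme (passing them does not by itself establish that a ratio is close to one).

```python
from statistics import mean, variance


def propensity_diagnostics(score_a, score_b):
    """Return (B, R) for the propensity scores of two groups.

    B is the difference in mean scores in standard-deviation units; pooling
    the two variances as sqrt((var_a + var_b) / 2) is an assumption here.
    R is the ratio of the score variances (group A over group B).
    """
    var_a, var_b = variance(score_a), variance(score_b)
    pooled_sd = ((var_a + var_b) / 2) ** 0.5
    return (mean(score_a) - mean(score_b)) / pooled_sd, var_a / var_b


def residual_variance_ratio(cov_a, score_a, cov_b, score_b):
    """Variance ratio (A over B) of one covariate's residuals after a pooled
    least-squares regression of that covariate on the propensity score."""
    x = list(score_a) + list(score_b)
    y = list(cov_a) + list(cov_b)
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    resid = [(yi - y_bar) - slope * (xi - x_bar) for xi, yi in zip(x, y)]
    n_a = len(score_a)
    return variance(resid[:n_a]) / variance(resid[n_a:])


def violated_conditions(b, r, residual_ratios):
    """List which of the three conditions fail the rough cutoffs in the text."""
    failures = []
    if abs(b) >= 0.5:
        failures.append("1: score means at least half a standard deviation apart")
    if not 0.5 < r < 2:
        failures.append("2: score variance ratio at or beyond 1/2 or 2")
    if any(not 0.5 < q < 2 for q in residual_ratios):
        failures.append("3: a residual variance ratio at or beyond 1/2 or 2")
    return failures


# Hypothetical scores and one hypothetical covariate for two small groups.
smokers_score = [0.2, 0.4, 0.6, 0.8]
others_score = [0.1, 0.2, 0.3, 0.4]
smokers_age = [30, 41, 52, 70]
others_age = [25, 33, 38, 47]

b, r = propensity_diagnostics(smokers_score, others_score)
q = residual_variance_ratio(smokers_age, smokers_score, others_age, others_score)
print(violated_conditions(b, r, [q]))
```

In practice the third check would be repeated for every covariate and the resulting ratios tallied into ranges such as (0, 1/2), (1/2, 4/5), (4/5, 5/4), (5/4, 2) and (2, ∞).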

Specific tabulations and calculations relevant to these points can be found, for example, in Cochran and Rubin (1973) "Controlling Bias in Observational Studies: A Review," Sankhya, Series A, Vol. 35, Part 4, pp. 417-446; Rubin (1973) "The Use of Matched Sampling and Regression Adjustments to Remove Bias in Observational Studies," Biometrics, 29, pp. 185-203; and Rubin (1979) "Using Multivariate Matched Sampling and Regression Adjustment to Control Bias in Observational Studies," Journal of the American Statistical Association, 74, pp. 318-328.

\ '•pciwie*. rrT 
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In particular, Cochran and Rubin (1973, p. 426) state that “linear regression on random 
samples gives wildly erratic results . . ., sometimes markedly over-correcting or even (with B = 1/4 for e^x) greatly increasing the original bias [when the ratio of the variances is one-half].” Table 3.2.2 in that article implies that, when the ratio of the variances of any covariate is one half, regression can grossly overcorrect for bias or grossly undercorrect for bias. Relevant results from that table are summarized in Table 1 of my July 1998 Oklahoma Report, attached as Exhibit 1 to this report. These three guidelines and Table 1 in that report also address regression adjustments on the logit or linear log scales, which, too, rely on linear additive effects in the covariates (for discussion of this point, see Anderson, Auquier, Hauck, Oakes, Vandaele, and Weisberg (1980), Statistical Methods for Comparative Studies, John Wiley, New York, p. 164).

B. Propensity Score Analyses Comparing Smokers and Never Smokers In The National Population As Represented By NMES and NHIS

As an indication of how different smokers and never smokers might be in the national population, I conducted two propensity score analyses using NMES, as discussed in my January 1998 Report in Minnesota¹ and in my July 1998 Report in Oklahoma. Those analyses, however, were unweighted to mirror what the experts had done. In Table 1 below and in the electronic materials produced in this case, I compared smokers and nonsmokers by conducting a propensity score analysis using NMES and its weights, thereby representing the national population according to the NMES design. In that analysis, I used the 32 covariates selected by the plaintiff's expert, Dr. Harrison, in his Oklahoma report.

The January T998 Minnesota Report is attached here as ! 



o 
Table 1: Propensity Score Analyses For Smokers Versus Never Smokers In NMES and NHIS



Smoki 

Grou 


Cu 

(NMES|u 1 .81 


Propensity Score: Bias, Variance Ratio

Residuals Orthogonal to Propensity Score: Variance Ratios in Range (0, 1/2), (1/2, 4/5), (4/5, 5/4), (5/4, 2), (2, ∞)


Formerft^ ' 

(NMES^™^ .75 1.2^^®^ 


rv) 

F emales^^g „ 

(NHIS^ .62 - 41 15 14 * 61 30 15 

Current 

Males 

(NHIS) ^>9 .45 21 23 50 3 7 4 

Former Awjw ^ JKh 

. JseWoSSSS^ 

rCRUUu r 1 ^ 

(NHISJg^^ .50 .30 13 25 73 20 4 

Former ^5?^ 

Males - QJ ••■ — - .. 

(NHIS) Jjf*J.87 J0 ^_j U 3S 67 10 11 

I also conducted four propensity score analyses using the 1991 NHIS data, using sampling weights and covariates selected by the plaintiff's experts, the Cambridge Team, in the Massachusetts litigation,² separately by gender, as in the Cambridge Team's analysis. Those results also appear in Table 1.


2 *dt ?p.fto^tl^H«rtman;iiang/WNeWfi<rad,f*ni* ofSmklclng on 

Medicaid spending In Massachusetts 1970-1997 Methods,” June 15^998. J111 ***»*« 

: -,v ". . iC:- . ; • - 

■ . .71. 
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In all instances, the differences between smokers and nonsmokers nationally are substantial 
enough that regression models cannot reliably adjust for even the background characteristics and 
other confounding factors included in the regressions. 

Nor can the linear regression methods used by the studies of the national population cited by Dr. Harris be considered as having adjusted in a reliable way for differences between smokers and nonsmokers. Better procedures would combine regression methods and propensity score subclassification, as suggested in the statistical literature.
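Propensity score subclassification can be sketched briefly. The example below is illustrative only and is not the analysis performed for this report: the function name, the choice of five equal-sized subclasses, and the data are assumptions, and the regression adjustment within subclasses that a combined procedure would add is omitted for brevity.

```python
from statistics import mean


def subclassified_difference(units, n_strata=5):
    """Average smoker-minus-nonsmoker outcome difference within propensity
    score subclasses, weighted by subclass size.

    `units` is a list of (propensity_score, is_smoker, outcome) tuples.
    Subclasses that lack either group are skipped, since they offer no
    like-with-like comparison.
    """
    ordered = sorted(units, key=lambda u: u[0])
    n = len(ordered)
    total, used = 0.0, 0
    for k in range(n_strata):
        stratum = ordered[k * n // n_strata:(k + 1) * n // n_strata]
        smokers = [y for _, s, y in stratum if s]
        others = [y for _, s, y in stratum if not s]
        if not smokers or not others:
            continue
        total += len(stratum) * (mean(smokers) - mean(others))
        used += len(stratum)
    if used == 0:
        raise ValueError("no subclass contains both groups")
    return total / used


# Hypothetical data: the outcome rises with the score, and smokers add 2 units.
units = [(s / 10, flag, 10 * (s / 10) + 2 * flag)
         for s in range(1, 11) for flag in (0, 1)]
print(subclassified_difference(units))
```

Comparing smokers and nonsmokers only within subclasses of similar propensity score keeps each comparison inside a region where both groups are observed, which is what an overall linear regression cannot guarantee.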


C. Propensity Score Analyses Comparing Smokers and Never Smokers In the Union Subsets of NMES and NHIS


The differences between smokers and nonsmokers observed nationally appear to be even more substantial in the union subsets of the NMES³ and NHIS⁴ data. These results using the sampling weights are presented in Table 2. Even if the experts for the plaintiff union trusts had examined a union population or relied upon a study in the literature that examined a union population, they must make reliable adjustments for differences between smokers and nonsmokers on background characteristics and other confounding factors; linear regression adjustments are not reliable in this setting.




• - m •• • * •'* 


f • « : '1^ *-i* ■ li‘..t 

v 

»_ 

». it 


. -... i c .j. 

I specifically looked at those individuals who reported that they had private insurance provided by a union in any of the four rounds of interviews. To compute the propensity scores, I again used the 32 variables in the Harrison Report in Oklahoma. Additional detail is found in the electronic materials accompanying my reports in this case.


no 


I examined in NHIS the same population that Dr. Dement used as a surrogate for the union population when he estimated smoking prevalence values. To compute the propensity scores, I used 34 of the variables employed by the Cambridge Team's June 15 report in Massachusetts, excluding industry and occupation variables and


reports in tty§gsse{ soul 


2£ 
Table 2: Propensity Score Analyses For Smokers Versus Nonsmokers In Union Subsets of NMES and NHIS


Smoking Group

Propensity Score: Bias, Variance Ratio

Residuals Orthogonal to Propensity Score: Variance Ratios in Range (0, 1/2), (1/2, 4/5), (4/5, 5/4), (5/4, 2), (2, ∞)


tQmh j&. , syN 


Forme^^iF 
(NME$||lM .69 

Curren^^*| 
(NHIS) Jf .86 


>y# 

Formet ^^^d , 
(NHIS]f^ .9 


L L [j 


ESTIMATES OF SMOKING PREVALENCE OR RELATIVE RISK BASED ON THE NATIONAL POPULATION ARE UNRELIABLE ESTIMATES FOR THE UNION POPULATION.


Dr. Harris relies upon estimates of relative risk from the literature that were derived using national data or data for particular firms, and he, moreover, estimates smoking prevalence values based on national data.⁵ Thus, he implicitly assumes a) that the quantitative impact of smoking on health is the same in the national population or for employees of selected firms as it is among individuals who participate in health care programs funded by union trust funds; and b) that smoking prevalence estimates using national data apply equally to the participants in the plaintiff union trusts.


Dr. Dement's estimate of the relative mortality risk of smokers for particular diseases likewise is based on the American Cancer Society's Cancer Prevention Survey II dataset, which is not representative of the national population or the union population in the United States.




vjti" • .v-AVv -." 


ACS gttaimrffc godyssqmoooc elshsttm oinirnaob rJ bsi 
Despite Dr. Harris' deposition testimony that “in an ideal world, ... one would really want to look specifically at the population at hand ...,” nowhere does Dr. Harris assess the validity
of these assumptions. Harris Maryland Dep. at 66-61 (discussing the African-American population). 


I have prepared propensity score analyses comparing the union populations represented in the NMES and NHIS samples to the rest of the population as represented in those samples, using their sampling weights. The results are in Table 3 using the 43 variables in the Harrison Report in Oklahoma for NMES, and using 40 of the covariates plus two indicators for current and former smoking from the Cambridge Team's June 15 Massachusetts report for NHIS, excluding industry and occupation indicators. In both instances, the analyses revealed that there are substantial differences between the union population and the rest of the nation on the background characteristics and other confounding factors that other plaintiffs' experts in litigation against these defendants believe are important. The differences are so substantial that estimates of relative expenditure risk ratios or smoking prevalence values based on national data cannot reliably be applied to the union population nationally, much less to the participants of the plaintiff union trusts.
is found in the electronic materials accompanying this report.
Table 3: Propensity Score Analyses For Union Versus Non-Union Populations as Represented in NMES and NHIS

Residuals Orthogonal to Propensity Score: Variance Ratios in Range
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STATE OF OKLAHOMA TOBACCO LITIGATION 
REPORT OF PROFESSOR DONALD B. RUBIN 

I am a professor of statistics at Harvard University. I served as Chairman of Harvard's Statistics Department for nine years, from 1985 to 1994. A copy of my most recently-prepared curriculum vitae is attached to this report as Exhibit 1.

I have been asked to review the estimates prepared by plaintiff's experts of health care expenditures incurred by the State of Oklahoma attributable to defendants' alleged misconduct. In this Report, I express three broad opinions, each of which is explained more fully below:


1. Reliable and statistically valid estimates of the health care expenditures, if any, incurred by the State of Oklahoma's Medicaid program as a result of defendants' alleged misconduct can be calculated.

2. The plaintiff's experts' analyses essentially have none of the characteristics that they must have to generate statistically valid estimates of the Medicaid expenditures incurred by the State, if any, that were caused by defendants' alleged misconduct. Nor do plaintiff's experts' analyses estimate reliably the Medicaid expenditures due to the existence of smoking.

3. The damage estimates, confidence intervals, and any statistical significance claimed for these damage estimates provided by Dr. Harrison are unreliable and statistically invalid.

I. HOW TO ADDRESS THE QUESTION

A. Introduction

To address the question of what sums, if any, the State of Oklahoma's Medicaid program expended as a result of defendants' alleged misconduct, one must compare the health care costs that Oklahoma Medicaid actually incurred to what those costs would have been in a counterfactual world without the alleged misconduct. To start, one has to specify precisely what the alleged misconduct was and when it occurred, as well as what would have occurred in the absence of that misconduct and when it would have occurred. Without direct data on the effect of defendants' misconduct on State-funded health care costs, two models must be used: one for the effect of the alleged misconduct on the subsequent smoking behavior of individuals who were or would have been recipients of Medicaid in Oklahoma; and another for the effect on the State's Medicaid expenditures of those changes in smoking behavior estimated from the first model.


There are several other essential features of the proper data collection and modeling approach to address the questions posed by this lawsuit. For example:


1. Both models must control for important background and other confounding variables so that they compare behaviors and costs for like individuals, that is, for matching individuals in the factual world (with defendants' alleged misconduct) and in the counterfactual world (without that alleged misconduct).

2. Both models must take into account the passage of time, beginning when the alleged misconduct occurred and continuing through each subsequent, relevant year.

3. Both models must focus on the population of interest. Here, that is those individuals who were or would have been recipients of the State's Medicaid program.¹

4. For those aspects of either model that rely on assumptions rather than data, the assumptions must be explicated and justified. It is critical, moreover, that each such assumption be capable of being individually assessed and altered.

5. The possibility that smokers' other health-related behavior may be affected by the misconduct must be considered.²

6. The statistical techniques employed in both models and in the studies upon which the models rely must be reliable and statistically valid. For instance, missing data must be handled in an appropriate manner; adjustments for background and other confounding variables must be made after taking into account differences in the distribution of those variables in the groups being compared; and sound statistical methods must be employed to assess the reliability and uncertainty in the resulting estimates.




7. Health care costs due to smoking that would have occurred regardless of the defendants' alleged wrongdoing would be excluded from the damage estimate. For instance, any increased health care costs due to smoking that occurred before any alleged misconduct cannot be attributable to that alleged misconduct.

1 Throughout this report, the reference to State's Medicaid program implicitly includes the charity care recipients and State employees included within plaintiff's damage calculations, unless indicated otherwise.

misconduct, and an expanded model could include this feature.


B. Behavioral Model 

\ 

Consider first the model of the effect of defendants' alleged misconduct on the smoking behavior of the pertinent population. Suppose it were alleged that the defendants failed to disseminate information about the health effects of cigarette smoking in 1965. I would first want to estimate the effect of this alleged failure on the smoking behavior of the individuals who were or would have been the recipients of State-funded health care. For each distinct type of smoking behavior that is relevant to health care costs, I would "estimate" its prevalence in the counterfactual world.

This estimation could be based on data and expert opinion concerning the effect on smoking cessation and smoking initiation of the availability of additional information about smoking's health risks. Such a model of smoking behavior would clearly have to consider background characteristics of the individuals who were or would have been recipients of State-funded health care; adjust for those background characteristics and other confounding factors that alter the effect of the alleged misconduct on smoking behavior; and consider how changes in smoking behavior in the pertinent population would vary from 1965 through the end of the damage period.

;r . ^ u.r vuhrjc-i^Cia /«. n. bs»aa.yi.£ 

Even when the objective of an analysis is solely to estimate aggregate quantities (i.e., not within subgroups such as defined by year of birth and race), adjustments for background characteristics (such as year of birth and race) and other confounding factors (such as health-related behaviors), which may alter the effect of the alleged misconduct on smoking behavior, generally must be made for the resulting estimates and inferences to be valid.
The task of creating a behavioral model is not as daunting


h Wb’&ficidwij'bTfSetdaf v»Hd‘yrcvdehe«l'dfdiffet^^bi4PCTdl^6Aa^rt^t^ sjf|jk'‘ \ W 

.eunsaijf £ixt> sbulini blyoa bbom i»bft*ox»n* ooa Joobnoa«(HiSj. ,1 ®t^ilS®f^ r '■■■,' 


tbnsoxs ns iwa^joubnosaw^SS 


r 3 , ^ 

■ jbes'i 
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subgroups of people can lead to reasonable estimates under explicit assumptions. That experts can 

attempt to make such assessments of modified behavior is illustrated, for example, by Washington State plaintiff's expert Jeffrey Harris in his reports, even though his assumptions are not explicit and the basis of his assertions is invalid.

C. Health Expenditures Model

Once I had modeled the effect of the alleged misconduct on smoking behavior over time, I would model the effect of that changed smoking behavior on the State's health care expenditures.

The model of the effect of the changed smoking behavior on the State's health care expenditures would have to consider characteristics of the individuals who were or would have been recipients of State-funded health care. These characteristics would include year of birth, sex, baseline mental and physical health, and other confounding factors that may be important predictors of how smoking behavior affects the State's Medicaid costs. Just as with the behavioral model, ideally each such characteristic would be measured on each individual before the moment in time when the alleged misconduct had an effect on that characteristic for that individual.

The health care expenditure model also would have to take into account the passage of time

e..~jp ■ ■* !.«• ... ....... r * 

. ; . . ambloma 

because #mP|ccumuUte in time. 

i ioi *.t.: ;• 

Such a model may have some general similarities to the Harrison Report's medical

i_T*»y 0.« 101 Htj\’ vm?s w : : • •, 



•fir:. ;• 


consideration of the passage of time.

iasfejigieypf ’enubateb at sub <fto»«tt»ditod ru etwtwi adflo sucaoaverfj .oltp.-iim aki? hi 

.OWJ ni-ti ilntrfj Trt'Sf nJ jnsraSibti iaubhribro aid? -roY jDubrwotim 


«■ Vi‘ T >* ’ 
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D. Illustrative Examples 

The essential characteristics of the models necessary to address reliably the Medicaid expenditures incurred by the State on account of the defendants' alleged wrongful conduct can be illustrated by examples. These examples are meant to illustrate specific issues; they are not meant to suggest that the models need to operate at this level of detail.

1. Example One

Two of the essential features of the modeling required here -- the need to consider background and other confounding characteristics and the passage of time -- are illustrated by a simple example. In this example, we consider two worlds, one the factual world as it exists and another counterfactual world without the defendants' alleged misconduct starting in 1965. We consider the same individual in the State in both worlds, which obviates the need to control for background characteristics and other confounding factors. We then see how this individual's health care costs vary in time in the two worlds.




Year   Factual World:                          Counterfactual World:
       Defendants Withhold Information         Defendants Disseminate Information

1935   Born                                    Born
1955   Starts smoking                          Starts smoking
1965   Defendants withhold information on      Defendants disseminate information on
       health risks - continues smoking        health risks - quits smoking
1970   Still smoking                           Quit for 5 years
1985   Quits smoking                           Quit for 20 years


In this example, the measure of the increase in health care costs due to defendants' misconduct for this individual is different in 1970 than it is in 1990. Specifically:

• l *_■* ‘ v vjV. ■ > 
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j 


n 


In 1970, the measure of damages is a comparison between a thirty-five-year-old who has smoked continuously for 15 years in the factual world versus that same thirty-five-year-old who would have smoked for ten years but then been abstinent for five years in the counterfactual world.

In 1990, the measure of damages for this same person is a comparison between a fifty-five-year-old who smoked for 30 years and has been abstinent for five years in the factual world versus that same fifty-five-year-old who would have smoked for only ten years and quit a quarter century ago in the counterfactual world.
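The arithmetic of this example can be laid out as a short sketch. The function name and argument names are hypothetical; the birth year of 1935 is inferred from the ages stated in the example, and the start and quit years follow the example's timeline.

```python
def smoking_history(year, born, start, quit_year):
    """Return (age, years_smoked, years_abstinent) as of `year`."""
    age = year - born
    years_smoked = max(0, min(year, quit_year) - start)
    years_abstinent = max(0, year - quit_year)
    return age, years_smoked, years_abstinent


BORN, START = 1935, 1955
FACTUAL_QUIT = 1985         # information withheld: quits only in 1985
COUNTERFACTUAL_QUIT = 1965  # information disseminated: quits in 1965

for year in (1970, 1990):
    factual = smoking_history(year, BORN, START, FACTUAL_QUIT)
    counterfactual = smoking_history(year, BORN, START, COUNTERFACTUAL_QUIT)
    print(year, factual, counterfactual)
# 1970 (35, 15, 0) (35, 10, 5)
# 1990 (55, 30, 5) (55, 10, 25)
```

The same person presents a different pair of smoking histories in each year, which is why the comparison has to be repeated year by year rather than made once.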


The measure of any increase in health care costs to the State due to the alleged misconduct by defendants plainly will be different in different years, requiring a longitudinal analysis commencing at the time of the alleged misconduct, or its first effect, and continuing through each subsequent, relevant year.

Notice that, if this individual's smoking behavior were unaffected by the alleged misconduct and, therefore, the person would have the same smoking history in the counterfactual world as in the factual world, there would be no effect on health care costs of the alleged misconduct.


2. Example Two

The next, more complex example illustrates additional essential characteristics of the proper data collection and modeling approach beyond the need to consider background and other confounding characteristics and to perform the analysis over time. First, it highlights the need to focus on the State's Medicaid population in both the factual world and in the counterfactual world, and that those individuals may not be the same in the factual world as they would be in the counterfactual world. In particular, a person may not be in the Medicaid population in the factual

£fuxom? K ncuaasso an? ^annroc n: of* o* .j-uv .pa»om« *■: nfwfJttzv t js>- 


Mitrtar&rs* 


ii lit tSlMMiiW ^i'I: 






ioQ*JlVu/*«i < 


i' alleged. 


-*.\V 4 ; »*v\*‘* ’•‘■‘ o • ,* • .• •. '-'y. . v 4 '* - .. 

smokers' other Health-reUtedbehavtors (e.g., overeating or alcohol consumption) might be different 
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in a world without the alleged misconduct. And. finally, the example illustrates that there is modeiing 


uncertainty about what would have happened in a world without the alleged misconduct, and so 
assumpWons tmd bases for assertions must be explicated. 


[Table: Example Two. Year labels and row alignment are not legible; the entries recoverable for each world are listed in order.]

World With Misconduct in 1965
    Born; has family history of heart disease
    Starts smoking heavily; normal diet
    Defendants withhold information on health risks - continues smoking
    Still smoking heavily; slightly overweight
    No change
    [...] lung cancer [...]

Counterfactual World 1
    Born; has family history of heart disease
    Starts smoking heavily; normal diet
    Defendants disseminate information on health risks - quits smoking
    Stops smoking; slightly overweight
    No change
    [...] heart attack and [...] surgery
    [...] second heart attack, enters nursing home
    Still in nursing home
    Dies

Counterfactual World 2
    Born; has family history of heart disease
    Starts smoking heavily; normal diet
    Defendants disseminate information on health risks - quits smoking
    Stops smoking; overeats heavily
    Becomes severely overweight with high cholesterol
    Has heart attack and dies

In counterfactual world 1, without the alleged misconduct, there are no behavioral changes
other than the cessation of smoking. In counterfactual world 2, in contrast, the cessation of smoking
[...]
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world 1. Instead of comparing total costs through the end of the damages period, one could have 
compared costs up to some event in time, such as death in any world. In that event, there would be
no difference in Medicaid expenditures incurred by the State in any of the worlds described above,
again assuming no medical costs before age 60 were borne by the State.

To continue this example, suppose that, in the factual world, the individual never received any
Medicaid benefits, as we have assumed above, so that in the factual world, this individual would cost
the State nothing in health care costs. But now, suppose that, in counterfactual world 1, this same
person spent his savings and became eligible for Medicaid in 1997, one year after he entered
the nursing home. In this world without the alleged misconduct, the State would have incurred tens
of thousands of dollars in nursing home costs for this individual.

It is clear that the Medicaid status of this individual in these alternative worlds significantly
[...] analysis of costs [...] by the State. This shows why the analysis must focus on [...]
State's Medicaid population in both the factual world and in the counterfactual world.
Of course, there [...] another counterfactual world, not displayed above, in which this
individual's smoking behavior [...] by the alleged misconduct, which automatically means
there would be no effect of the [...] misconduct on States' health care expenditures.

Because these alternative worlds are counterfactual, there is no direct evidence to estimate
how likely each would have been in the absence of the alleged misconduct. Instead, evidence from
[...] observed [...], coupled with [...] to estimate the likelihood of [...]

Finally, from a [...] perspective, such an analysis in each counterfactual world need not
be [...] Oklahoma's Medicaid [...]
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characteristics and confounding factors discussed previously. The subgroup specific results would 


then be aggregated across all the subgroups to create an estimate of the State's total Medicaid costs 
due to the claimed misconduct. 
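
The aggregation step described above can be sketched in a few lines. This is an illustration under stated assumptions, not the report's procedure: the subgroup labels, sizes and per-person costs are all hypothetical, and the function name is invented for the example.

```python
from typing import Dict, Tuple

# Hypothetical inputs: for each subgroup, (number of Medicaid recipients,
# estimated mean cost per person in the factual world,
# estimated mean cost per person in the counterfactual world).
SubgroupEstimates = Dict[str, Tuple[int, float, float]]


def total_cost_due_to_misconduct(estimates: SubgroupEstimates) -> float:
    """Sum subgroup-specific cost differences, weighted by subgroup size."""
    total = 0.0
    for size, factual_cost, counterfactual_cost in estimates.values():
        total += size * (factual_cost - counterfactual_cost)
    return total


# All numbers below are made up for illustration.
example = {
    "subgroup A": (1000, 2500.0, 2400.0),
    "subgroup B": (1500, 2200.0, 2150.0),
    "subgroup C": (500, 6000.0, 6100.0),
}
print(total_cost_due_to_misconduct(example))
```

Note that a subgroup's contribution can be negative (subgroup C here), which mirrors the nursing home example: costs in the counterfactual world may exceed costs in the factual world.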

3. Comments On The Task

Drawing inferences from observational data in this way requires care. Ideally the effort
would [...] the input of a team of [...] having knowledge about medical expenditure models
[...] behavioral models. Nevertheless, there is a straightforward scientifically and statistically valid
framework [...] drawing causal inferences about any medical expenditures due to the defendants'
alleged [...]. To be scientific, the underlying assumptions must be explicated in detailed and
disaggregated forms. That way, they can be assessed individually for plausibility and can be altered
individually [...] allow [...] the consequences on answers. When all the assumptions are
bundled together, the resultant [...] may be little more than a subjective assumption of the answer.

Statistically, each instance of defendants' alleged misconduct (specified by its character and
its timing) can be viewed as defining a level of a factor in a hypothetical factorial experiment. For
example, one factor could correspond to the defendants' alleged failure to disseminate adequately
information regarding the health risks of smoking, and levels of this factor could correspond to the
kinds of information and their dates of dissemination. A second factor could be the alleged failure
of defendants to market safer cigarettes. The combinations of [...] levels of [...] factors
would define the defendants' conduct in a counterfactual world without specific acts of alleged
misconduct. Issues of additive versus synergistic effects of the alleged acts of misconduct would be
addressed by [...]
of the costs across the specific, individual alleged acts of misconduct. This framework allows the
explication and disentangling of assumptions needed to address the question of the effect on the
State's health care costs due to defendants' alleged misconduct.
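
The factorial framing can be made concrete with a short sketch. Everything named here is an assumption made for illustration: the factor names, the levels (including the dates 1965 and 1975), and the cost figures used in the usage note are hypothetical and do not come from the report.

```python
from itertools import product

# Hypothetical factors and levels, loosely following the two examples in the text.
factors = {
    "health_risk_information": [
        "as in factual world",
        "disseminated in 1965",
        "disseminated in 1975",
    ],
    "safer_cigarette": ["as in factual world", "marketed"],
}


def enumerate_worlds(factor_levels):
    """Return one dict per combination of factor levels (one per 'world')."""
    names = list(factor_levels)
    combos = product(*(factor_levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def interaction_contrast(cost_none, cost_a, cost_b, cost_both):
    """Zero when the cost effects of two alleged acts are additive.

    cost_none: cost with neither act; cost_a / cost_b: cost with only one act;
    cost_both: cost with both acts. A nonzero value indicates synergy.
    """
    effect_both = cost_both - cost_none
    sum_of_single_effects = (cost_a - cost_none) + (cost_b - cost_none)
    return effect_both - sum_of_single_effects


worlds = enumerate_worlds(factors)
for world in worlds:
    print(world)
```

With three levels of one factor and two of the other there are six worlds, the first of which is the factual world. `interaction_contrast(100, 110, 105, 115)` is zero (additive effects), while `interaction_contrast(100, 110, 105, 125)` is 10 (synergistic effects).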



II.  THE PLAINTIFFS' REPORTS* DO NOT ADDRESS THE QUESTION OF THE
     ALLEGED MISCONDUCT OF THE DEFENDANTS; INSTEAD THEY PURPORT
     TO ESTIMATE THE HEALTH CARE COSTS OF THE EXISTENCE OF SMOKING

* My comments in this Report on the deficiencies in the plaintiffs' experts' analyses are
illustrated by specific examples using the Harrison Report's estimate of the smoking-[...]
applicable to Harrison's estimates of the smoking-attributable health care expenditures of
[...] presented in the Max Report, and to the analyses of Mr. Roberts.

The models and data analyses in plaintiffs' reports, and in particular the Harrison Report,
[...] have none of the characteristics that are necessary to estimate the health care costs
incurred [...] Oklahoma because of defendants' alleged misconduct. Plaintiffs' statistical experts'
analyses simply do not address anywhere the effect of any alleged misconduct by the defendants, as
Harrison [...] with respect to his own analyses (e.g., Harrison Dep. at 253-55).
In fact, there does not even exist a rudimentary behavioral component in plaintiffs' models to reflect
the effect of defendants' alleged misconduct on smoking behavior; plaintiffs' models give no
consideration whatsoever to defendants' alleged misconduct. Thus, for example, the models and data
analyses of the Harrison Report [...] no effort to exclude from the computation any increased health
care costs attributable to smoking that was unaffected by defendants' alleged misconduct.

[...] from any applicable legal considerations, the question of the effect of defendants'
alleged misconduct on the health care expenditures of the plaintiffs, in principle, can be addressed by
statistical analyses of appropriate data. Any statistical analysis that can address this question must
distinguish between smoking behavior affected and unaffected by [...] account
the passage of time, adjust reliably for differences in background characteristics and other
confounding factors, between those exposed to cigarette smoke and those never exposed to cigarette
smoke, focus on the relevant population, explicate assumptions, use valid statistical methods, and
account for the timing and nature of defendants' alleged misconduct (e.g., exclude any increased
health care costs due to smoking that occurred before any alleged misconduct by defendants). But
the models and data analyses essentially have none of these critical characteristics. In the
end, they fail to address the alleged misconduct.

Plaintiffs' expert reports consist only of expenditure models. They simply attempt
to estimate the difference in health care expenditures, over a number of years, between individuals
exposed to cigarette smoke and individuals never exposed to cigarette smoke of similar background
characteristics (such as age, sex, race) and other confounding factors (such as education, income,
diet) regardless of when or why that exposure occurred. The medical expenditure model in the
Harrison Report, for instance, appears to be an attempt to address a causal question different from
one involving alleged misconduct: What health care expenditures were incurred by plaintiffs as a
result of the existence of cigarette smoking, regardless of defendants' alleged wrongdoing?

As a result of attempting to address this other question, the results in the Harrison Report,
even if all of the errors in their underlying analyses were eliminated, would still overstate the health
care expenditures due to defendants' alleged misconduct.


III. PLAINTIFFS' EXPERT REPORTS FAIL TO SHOW THAT THE EXISTENCE OF
     SMOKING CAUSED THE DAMAGES THAT ARE ESTIMATED


[...] which no
cigarette smoking existed. 4 To be reliable and valid, such an estimate must possess many of the same
characteristics that I described above in section I of this report. But plaintiffs' analyses do
not have these characteristics, and they consequently fail to estimate reliably the health care costs
incurred by plaintiffs because of the very existence of smoking. For example:

1.   Harrison's medical expenditure model does not adequately control for background
     and other confounding variables, an issue addressed in more detail in Section III.A.

2.   Harrison's medical expenditure model does not address the passage of time, an issue
     addressed in more detail in Section III.B.

3.   Harrison's expenditure model, although it purports to focus on a national Medicaid
     population, does not do so correctly, and fails even to adjust these invalid estimates
     to be generally appropriate to the Oklahoma Medicaid population, issues addressed
     in more detail in Section III.C. and Section III.F.

4.   Harrison's medical expenditure model does not explicate necessary assumptions,
     an issue discussed in Section III.D. Moreover, it does not insure that the data
     [...] the analyses' implicit assumptions, an issue further discussed in
     Sections III.C. and III.F.

5.   Harrison's expenditure model does not consider the possibility that other health-
     related behaviors (e.g., overeating, illicit drug behavior) may be affected by the
     existence or absence of the alleged misconduct. See Section III.E.

6.   In addition to problems identified in points 1-5 above, the statistical methods used in
     Harrison's model contain errors and are not statistically valid: missing
     data are inappropriately handled; standard errors and confidence intervals are
     incorrectly calculated; and SAFs (smoking attributable fractions) are inappropriately
     calculated. See Section III.F.

A.   [...]

     Introduction

The Harrison Report employs regressions to adjust for background and other confounding 

characteristics of smokers and nonsmokers, such as their race, income, body mass index, and
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educational level. These regressions rely upon certain linear additive assumptions, and the 
consequences of making those linearity assumptions should be addressed. Because these 
consequences are sensitive to the degree of overlap in the distributions of these background and other
confounding variables between smokers and nonsmokers, it is appropriate to use propensity score
methods to examine the overlap. Without such analyses, one cannot have any confidence in the
Harrison Report's regressions' ability to adjust reliably for differences between smokers and
nonsmokers with respect to covariates (i.e., the background and other confounding variables).
The statistical literature warns that regression analysis cannot reliably adjust for differences in
covariates when there are substantial differences in the distribution of these covariates in the two
groups.

For example, William G. Cochran, who served on the Advisory Committee that wrote the
1964 Surgeon General's Report, wrote extensively on methods for the analysis of observational
studies, as summarized in my chapter on his work on this topic in W.G. Cochran's Impact on
Statistics ([...] 1984, John Wiley, New York). In 1957 Cochran wrote:

     "[...] the x-variables [...] real differences among groups -- the case in which
     adjustment is needed [...] -- covariance adjustments [i.e., regression adjustments]
     involve a greater or less degree of extrapolation. To illustrate by an extreme case,
     [...] for differences in parents' income in a comparison
     of private and public school children, and that the private-school incomes ranged
     from $10,000-$12,000, while the public-school incomes ranged from $4,000-$6,000.
     The covariance would adjust results so that they allegedly applied to a mean income
     [...] in each group, although neither group has any observations in which
     incomes are at or even near this level." Cochran, William G. "Analysis of
     Covariance: Its Nature and Uses." Biometrics, Vol. 13, pp. 261-281.

[...]

     "If the original x-distributions diverge widely, none of the methods [e.g. regression
     adjustment] can be trusted to remove all, or nearly all, the bias. This discussion
     [...]"
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And in the same article: 


     "With several x-variables, the common practice is to compare the marginal
     distributions in the two groups for each x-variable separately. The above argument
     makes it clear, however, that if the form of the regression of y on the x's is unknown,
     identity of the whole multi-variate distribution is required for freedom from bias."
     Cochran, William G. "The planning of observational studies of human populations."
     Journal of the Royal Statistical Society, A, Vol. 128, pp. 234-265.
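
Cochran's school-income illustration can be reproduced numerically. The sketch below is an assumption-laden toy, not an analysis from the report: incomes are in thousands of dollars, three made-up families are placed in each group, and the outcome is defined to depend only on income through a cubic curve, so that the true "school effect" is zero by construction. The adjustment shown is the usual covariance adjustment with a common within-group slope.

```python
from statistics import mean


def covariance_adjusted_difference(x_a, y_a, x_b, y_b):
    """Mean outcome difference (A minus B) after linear covariance adjustment.

    Uses a common slope pooled from the within-group sums of squares and products.
    """
    xa_bar, ya_bar = mean(x_a), mean(y_a)
    xb_bar, yb_bar = mean(x_b), mean(y_b)
    sxy = (sum((x - xa_bar) * (y - ya_bar) for x, y in zip(x_a, y_a))
           + sum((x - xb_bar) * (y - yb_bar) for x, y in zip(x_b, y_b)))
    sxx = (sum((x - xa_bar) ** 2 for x in x_a)
           + sum((x - xb_bar) ** 2 for x in x_b))
    slope = sxy / sxx
    return (ya_bar - yb_bar) - slope * (xa_bar - xb_bar)


def cubic_outcome(income):
    """Hypothetical outcome that depends on income only (no group effect)."""
    return income ** 3


def linear_outcome(income):
    """Hypothetical outcome for which the linear adjustment is exactly right."""
    return 2 * income + 1


# Incomes in thousands of dollars; the ranges follow Cochran's illustration,
# the individual values are made up.
private_income = [10, 11, 12]
public_income = [4, 5, 6]

for outcome in (linear_outcome, cubic_outcome):
    adjusted = covariance_adjusted_difference(
        private_income, [outcome(x) for x in private_income],
        public_income, [outcome(x) for x in public_income],
    )
    print(outcome.__name__, adjusted)
```

When the outcome is truly linear in income the adjusted difference is zero, as it should be. With the cubic outcome the adjusted difference is far from zero even though there is no group effect at all, because the adjustment extrapolates a straight line into an income range where neither group has observations.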

In particular, there are three basic distributional conditions that in general practice must
simultaneously obtain for regression adjustment (whether by ordinary linear regression, linear logistic
regression or linear-log regression) to be trustworthy. If any of these conditions is not satisfied, the
differences between the distributions of covariates in the two groups must be regarded as substantial,
and regression adjustment, as performed in the Harrison Report, is unreliable and cannot be
trusted. These conditions are:

1.   The difference in the means of the propensity scores in the two groups being compared
     must be small (e.g., the means must be less than half a standard deviation apart), unless
     the situation is benign in the sense that: (a) the distributions of the covariates in both
     groups are nearly symmetric, (b) the distributions of the covariates in both groups have
     nearly the same variances, and (c) the sample sizes are approximately the same.

2.   The ratio of the variances of the propensity score in the two groups must be close to
     one (e.g., 1/2 or 2 are far too extreme).

3.   The ratio of the variances of the residuals of the covariates after adjusting for the
     propensity score must be close to one (e.g., 1/2 or 2 are far too extreme).
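
The three diagnostics can be computed with a few lines of code. This is a minimal sketch under stated assumptions: the function names are hypothetical; the mean difference is standardized by the root mean square of the two group standard deviations; the residuals are taken from a single pooled straight-line fit of the covariate on the score; and the cutoffs in `within_rough_guidelines` simply encode the "e.g." values quoted above rather than any formal rule.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(scores_a, scores_b):
    """Difference in mean score, in units of the RMS of the two group SDs."""
    scale = sqrt((variance(scores_a) + variance(scores_b)) / 2)
    return (mean(scores_a) - mean(scores_b)) / scale


def variance_ratio(values_a, values_b):
    """Ratio of sample variances, group A over group B."""
    return variance(values_a) / variance(values_b)


def residual_variance_ratio(cov_a, scores_a, cov_b, scores_b):
    """Variance ratio of a covariate's residuals after a pooled linear fit on the score."""
    scores = list(scores_a) + list(scores_b)
    cov = list(cov_a) + list(cov_b)
    s_bar, c_bar = mean(scores), mean(cov)
    slope = (sum((s - s_bar) * (c - c_bar) for s, c in zip(scores, cov))
             / sum((s - s_bar) ** 2 for s in scores))
    resid = [c - c_bar - slope * (s - s_bar) for s, c in zip(scores, cov)]
    n_a = len(cov_a)
    return variance(resid[:n_a]) / variance(resid[n_a:])


def within_rough_guidelines(mean_difference, ratio):
    """True if |difference| < 0.5 SD and the variance ratio lies strictly between 1/2 and 2."""
    return abs(mean_difference) < 0.5 and 0.5 < ratio < 2.0


# Made-up linear propensity scores for two small groups.
group_a = [0.2, 0.9, 1.4, 2.1]
group_b = [-0.5, 0.1, 0.4, 1.0]
b = standardized_mean_difference(group_a, group_b)
r = variance_ratio(group_a, group_b)
print(b, r, within_rough_guidelines(b, r))
```

In practice the scores would be estimated (for example by logistic regression) and the residual ratio would be computed once per covariate; both steps are outside this sketch.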

Specific tabulations and calculations relevant to these points can be found, for example, in
Cochran and Rubin (1973) "Controlling Bias In Observational Studies: A Review," Sankhya, Series
A, Vol. 35, Part 4, pp. 417-446; Rubin (1973) "The Use of [...]; and Rubin
(1979) "Using Multivariate Matched Sampling and Regression Adjustment to Control Bias in
Observational Studies," Journal of the American Statistical Association, 74, pp. 318-328.

In particular, Cochran and Rubin (1973, p. 426) state that "linear regression on random
samples gives wildly erratic results . . . sometimes markedly overcorrecting or even (with B = 1/4
for e^x) greatly increasing the original bias [when the ratio of the variances is one-half]". Table 3.2.2
in that article implies that, when the ratio of the variances of any covariate is one half, regression can
grossly overcorrect for bias or grossly undercorrect for bias. Relevant results from that table are
summarized here as Table 1.

[...] three guidelines and Table 1 address Harrison's regression
adjustments on the logit or linear-log scale because they too rely on linear additive effects in the
covariates ([...] discussion of [...], see e.g., Anderson, Auquier, Hauck, Oakes, Vandaele, and
Weisberg (1980), Statistical Methods for Comparative Studies, John Wiley, New York, p. 164).

     Application to Harrison Report's Analyses on the NMES Medicaid Sample

I created six alternative [...] propensity scores between "smokers" and "nonsmokers" in the
NMES Medicaid database used in the Harrison Report. All six propensity scores were estimated by
logistic regression using the 32 covariates included in the Harrison regressions (see Display 1 for list
of variables). The first set of propensity scores used the variable SMNOW with 1 = current smoker
vs. 0 = not current smoker. The second set of propensity scores used the variable SMOKED with
1 = former or current smoker and 0 = never smoker. The third through sixth sets of propensity score
analyses compared those with no exposure to cigarette smoke to those with different kinds of
exposure [...]
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All propensity scores (which are estimated probabilities) were then transformed to the logit 
scale so that they were linear in the original covariates. This transformation was done for three 
reasons. First, relative to the raw propensity (probability) scale, the linear propensity (logit
probability) scale is more relevant for assessing the efficacy of linear modeling adjustments (including
those based on logistic regression and linear-log models). Second, the linear propensity scores tend
to produce more benign distributions with more similar variances and more symmetry, because they
are averages of the original covariate values. And third, the linear propensity scores are
more directly related to results in the literature on adjustments for covariates based on linearity
assumptions.

[...] of (linear) propensity scores: (1) the difference in the means of
the propensity scores between smokers and nonsmokers; (2) the ratio of the variances of the
propensity scores for smokers and nonsmokers; and (3) for each of the 32 covariates, the ratio of the
variance of the residuals [...] propensity scores for smokers to the variance of the residuals for
nonsmokers (i.e., the residuals after adjusting for the propensity scores). The results of those
[...] are found in Table 2.
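
The transformation to the logit scale mentioned above is a one-line calculation. The sketch below is illustrative only; the function name is hypothetical and the input is assumed to be an already estimated probability.

```python
from math import log


def linear_propensity_score(probability):
    """Map an estimated propensity (a probability) to the logit scale."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return log(probability / (1.0 - probability))


for p in (0.1, 0.5, 0.9):
    print(p, linear_propensity_score(p))
```

When the propensity is estimated by logistic regression, this transformed value is the fitted linear combination of the covariates, which is why it is the natural scale for judging linear adjustments.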

Comparing these results to benchmarks for reliability in the statistical literature, it is clear
that the Harrison regression model cannot be said to adjust reliably even for those covariates included
[...] the propensity [...]
close to one, the sample sizes are not nearly equal. Now consider the residuals of the 32 covariates
and the ratios of their variances among smokers and nonsmokers.* There are [...] covariates
with ratios greater than 2, and there are ten ratios outside the range from 4/5 to 5/4. This situation
is not benign.

* The benchmark values given in Table 1 are accurate for the difference in means and the ratio
of variances of the propensity score. They are not [...] the residuals of the
[...] some of the covariates are discrete. Nevertheless, the more the ratios of
[...] residuals differ from 1, the larger the difference between the distributions
of the residuals [...] smokers and nonsmokers.

Now consider the second set of propensity scores, for current or former smokers (ever
smokers) versus never smokers. Once again, the means of the propensity scores are nearly a full
standard deviation apart, but now the ratio of variances is only .51, although with more equal sample
sizes. Moreover, the ratio of variances of the residuals of the original 32 covariates has four beyond
2 and [...] outside the range of 4/5 to 5/4. This is not a benign situation either.

The third through sixth sets tell similar stories based on the results presented in Table 2. All
comparison groups are close to one standard deviation apart on their propensity scores, with quite
a few extreme variance ratios. Particularly striking is the comparison of the unexposed and current
groups, which are [...] than [...] standard deviation apart.

Thus, based on the results displayed in Table 2, the linear regression methods used by the Harrison
Report cannot be considered reliable. Better procedures would combine regression methods and
[...] suggested by the statistical literature.

In summary, Harrison's regressions do not reliably adjust for differences in background
characteristics and other confounding factors between "smokers" and "nonsmokers" in the NMES
Medicaid sample [...]

B.   The Harrison Report Fails to Consider the Passage of Time

[...]
care costs to what they would have been in a counterfactual world in which no cigarette smoking
existed. Such a comparison still must take into account the passage of time. This can be illustrated
[...]
          Factual World                         Counterfactual World
          With Smoking                          Without Smoking

193[?]    Born; has family history of           Born; has family history of heart
          heart disease                         disease

195[?]    Starts smoking heavily; normal        Does not smoke; overeats
          diet

197[?]    Still smoking heavily                 Becomes severely overweight with
                                                high cholesterol

1980      Contracts lung cancer and dies        No change

1981                                            Has heart attack

1985                                            Has by-pass surgery

1990                                            Has second heart attack and enters
                                                nursing home

1995                                            Still in nursing home

This example highlights three principles. First, the cost difference in worlds with and without
smoking to plaintiffs of providing medical care to a person may vary by year. To compare plaintiffs'
health care expenditures today to what they would have been in a world without smoking, the passage
of time must be taken into account. Second, one must consider the possibility that [...]
other health-related behaviors (e.g., overeating or alcohol consumption) [...]
without smoking from a world with smoking, an issue discussed [...]. Third, the
[...]

The analyses in the Harrison Report do not take into account the passage of time in this
manner. For instance, the Harrison Report would give no consideration at all to differences in health
care costs borne by plaintiffs after 1980 for the person in the above example. But any valid
comparison of the health care costs to plaintiffs in worlds with and without smoking must do so.

Indeed, more striking examples of the effect of a counterfactual world without smoking on
plaintiffs' health care [...] easily be imagined. Suppose that, in the factual world with
smoking, the individual in the example above never received any health care funded by plaintiffs.
In a world with smoking, this [...] would cost plaintiffs nothing in health care costs. In contrast,
suppose that, in the counterfactual world without smoking, he entered the nursing home, where he
spent his savings in 1990 and became eligible for Medicaid in 1991. In a world without smoking,
plaintiffs would have incurred tens of thousands of dollars in nursing home costs for him.* The
Harrison Report, however, never considers these kinds of potential changes in the pertinent recipient
population, changes that may occur in the counterfactual world where there is no cigarette
smoking.
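
The year-by-year comparison in this example can be laid out as a small calculation. The sketch below is hypothetical throughout: the function name is invented, and the $30,000 annual nursing home figure is a placeholder chosen only to be consistent with "tens of thousands of dollars"; the eligibility year (1991) and nursing home entry year (1990) come from the example.

```python
def payer_cost_by_year(annual_cost, eligible_years, years):
    """Cost borne by the payer in each year.

    The payer bears a year's cost of care only if the person is eligible in that year.
    """
    return {
        year: (annual_cost.get(year, 0.0) if year in eligible_years else 0.0)
        for year in years
    }


years = range(1988, 1996)

# Factual world with smoking: the person never receives plaintiff-funded care.
factual = payer_cost_by_year({}, set(), years)

# Counterfactual world without smoking: in the nursing home from 1990,
# Medicaid-eligible from 1991. The 30,000 figure is a made-up placeholder.
nursing_home_cost = {year: 30000.0 for year in range(1990, 1996)}
counterfactual = payer_cost_by_year(nursing_home_cost, set(range(1991, 1996)), years)

# Factual minus counterfactual: negative values mean the payer would have
# spent more in the world without smoking.
difference = {year: factual[year] - counterfactual[year] for year in years}
for year in years:
    print(year, difference[year])
```

The difference is zero through 1990 and negative from 1991 on, which is the sense in which a valid comparison has to be made year by year and has to track who is in the recipient population in each world.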

C.   The Harrison [...] Analyses Do Not Appropriately Apply to the Relevant
     Oklahoma Medicaid Population

Harrison's analyses [...] to calculate National Medicaid-specific "smoking-attributable-
fractions" (SAFs) using NMES and then to apply these SAFs to Oklahoma Medicaid data. There are
several serious flaws with these analyses. First, the application of the SAFs to Oklahoma only adjusts
for the main effects of age, race [...] weighting using 3 levels of age, 2 levels of sex, and 2
[...] that in each year, the Oklahoma Medicaid [...], an
issue discussed in Section III.C.1. Second, there is no attempt to [...] what effect
adjustment for other variables might have had, variables that Harrison's own model suggests are
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important predictors of SAFs (e.g., poverty level, marital status). Third, even the calculation of 
National Medicaid SAFs is incorrect, as discussed briefly in Section III.C.3. and more completely in
Section III.F.

     1.   Harrison's Analyses Fail to Adjust Reliably Even For Age, Sex, Race

When applying his National Medicaid SAFs (incorrectly calculated) to the Oklahoma
Medicaid samples, Harrison's analysis only adjusts for the main effects of sex, of three levels of age,
and two levels of race, despite the ability to adjust for all main effects, two-way interactions, and
three-way interactions, because of the overlap in the contingency table structure of the data sets.
These main-effect-only adjustments are analogous to simple linear modeling adjustments whose
reliability can be investigated by propensity score analyses, which reveal the differences between the
NMES Medicaid Sample and the Oklahoma Medicaid Samples. Two analyses will be presented: the
first [...] Harrison's age and gender and race categories, including all interaction terms; the second also
[...] categories, as well as linear and quadratic terms in age in years,
information that is available in NMES and Oklahoma data sets, and the three levels of race
information (Black, Hispanic, Other) also available in both Oklahoma and NMES data sets.

Table 3 presents propensity score analyses for the first set, analogous to Table 2, except that
each row corresponds to a year of Oklahoma Medicaid data from 1988 to 1997 and compares that
data with the NMES Medicaid sample. Table 3 shows that in each of the years, the NMES Medicaid
population is nearly a full standard deviation away from each of Oklahoma Medicaid populations
along the propensity score. Moreover, although the variance ratios of the propensity scores start out
as nearly equal in the earlier years, the differences grow in time, as the [...] moves [...]
. t r 20 
With respect to the residuals of the 11 original covariates, in each year, at least 5 have
variance ratios outside of the range (1/2, 2) and either 10 or all 11 are outside the range (4/5, 5/4).
Clearly, this is not a good situation for simple linear adjustments. The adjustments for age, sex, and
race should have been done within each of the 12 cells of the age x race x gender contingency table.
Harrison's weighting adjustment using only main effects is unreliable.
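
An adjustment carried out within each of the 12 age x race x gender cells can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure used in either report: the function name `cell_weights`, the cell labels, and the toy records are all hypothetical. Each source-sample record receives the weight (target share of its cell) / (source share of its cell), so that the weighted source sample reproduces the target's full cross-classification rather than only its margins.

```python
from collections import Counter

def cell_weights(source_cells, target_cells):
    """Return one weight per source record so that the weighted source
    sample matches the target's distribution over cells.

    Each element is a hashable cell label such as (age_level, race_level, sex).
    All names and labels here are hypothetical.
    """
    src = Counter(source_cells)
    tgt = Counter(target_cells)
    n_src, n_tgt = len(source_cells), len(target_cells)
    weights = []
    for cell in source_cells:
        target_share = tgt.get(cell, 0) / n_tgt   # 0 if the cell is absent in the target
        source_share = src[cell] / n_src
        weights.append(target_share / source_share)
    return weights

# Toy data, invented for illustration: (age level, race level, sex).
source = [("young", "black", "F")] * 2 + [("old", "other", "M")] * 2
target = [("young", "black", "F")] * 3 + [("old", "other", "M")] * 1
w = cell_weights(source, target)  # [1.5, 1.5, 0.5, 0.5]
```

A target cell that has no records in the source sample cannot be represented by any weighting, which is one reason overlap between the two cross-classifications matters.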


[illegible] consider the [illegible] of propensity score analyses using the finer information. Table
4 is analogous to Table 3 but [illegible] all main effects and interactions formed by three levels of age,
three levels of race, and two [illegible] sex, and linear and quadratic terms in age in years. The results
in Table 4 are similar to those in Table 3 but even more extreme. Now the Oklahoma Medicaid
populations are everywhere more than one standard deviation away from the National Medicaid sample
with respect to the propensity score, and the ratio of variances of the propensity scores is between
[illegible] 1.5 [illegible], growing in time. The same picture occurs with respect to the ratios of variances of the
residuals of the original variables: most have variance ratios outside the range (1/2, 2) and all but a
few are outside the range (4/5, 5/4), with the discrepancies growing in time as we move further
away from 1987. [illegible] the simple main-effect adjustments used in Harrison's analyses to adjust the National
Medicaid SAFs to the Oklahoma Medicaid populations cannot be trusted even to adjust for age, sex,
and race.

2. [illegible] of "Other" [illegible]
Background Variables When Adjusting NMES Results to the Oklahoma [illegible]

[illegible] all 32 [illegible]
effects of sex, age (in 3 levels) and race (in 2 levels) are needed to adjust these SAFs to Oklahoma.
The same answer would be obtained from Harrison's analyses if everybody or nobody in Oklahoma
smoked, or if all were married or none were, or if all were in poverty or none were, etc. See Harrison
deposition at 195-196, 361-365, 724-728. Even if (1) the National Medicaid SAFs, as estimated by
Harrison's analyses, were absolutely correct as functions of the 32 background characteristics and
smoking exposures, and (2) these were all the factors that were needed to be considered, data on the
distribution of these factors (not only of age, sex, race) would be needed in Oklahoma.

3. Harrison's Analyses Fail to Consider Additional Medicaid Recipients In
a Counterfactual World Without Smoking

The examples in Section [illegible] illustrate that in a counterfactual world without smoking, there
could be Medicaid expenses [illegible] individuals who in the factual world have no Medicaid expenses.
This could be because some [illegible] smokers would live longer in the counterfactual world without
smoking and incur Medicaid expenses at the end of their longer lives in the counterfactual world. Or
it could be due to actual smokers without Medicaid expenses turning to alternative behaviors in a
counterfactual world that would lead to Medicaid expenses (and even earlier death) in the
[illegible] world. Other reasons involving nonsmokers can also be posited.

Harrison's analyses ignore these possibilities and thereby eliminate a portion of the
counterfactual [illegible] that has [illegible] Medicaid health care cost in the factual world but may incur Medicaid
[illegible] in the counterfactual world without smoking.




4. Harrison's Analysis Calculates Incorrect SAFs

Harrison's analysis makes a statistical mistake when calculating Medicaid SAFs, even
assuming his models are correctly formulated. The problem is that the SAFs he calculates are
national SAFs, not national Medicaid SAFs, which is the objective stated in his report (e.g., p. 12)
and in deposition. See Harrison deposition at 163, 723. See Section III.F. for more details on this
criticism.
D. Harrison's Analyses Fail to Explicate Critical Assumptions

When constructing a model for estimation in a counterfactual world, assumptions must be
made. These assumptions must be stated explicitly and evaluated. Harrison's report leaves critical
assumptions implicit and not evaluated. Moreover, even when estimating factual world distributions
from available data, assumptions are often needed and should be explicated. These statements of
[illegible] are missing from the Harrison Report, and appear to be replaced by implicit statements
[illegible] the data sets that are available to Harrison, even with their limitations, are adequate
because they are available (see Harrison deposition at 116, 121, 165-167, 675, 732, 738-739).

E. Harrison's Analyses Fail to Consider Effect on Other Health-Related Behaviors

[illegible] Section III.B., to estimate the effects of smoking in the factual
world [illegible] to a counterfactual world without the existence of smoking, alternative health-related
behaviors that might be different in the counterfactual and factual worlds must be considered.
Behaviors such as overeating or the use of illicit drugs must be at least considered. Harrison's
analyses implicitly assume that there will be no such changes. This is done by keeping all values of
background variables fixed at their observed values in the factual world when calculating individual
"[illegible] NS: the level of Medicaid expenditures claimed given that there is no smoking history,
his or her other (unchanged) characteristics, and the fact that he or she made some expenditure claim
[in the factual world]." (Harrison Report at 26).

F. Harrison's Analyses Have Statistical Errors

In addition to the conceptual and modeling shortcomings and errors in the Harrison Report
identified in previous sections, the Harrison Report contains several fundamental statistical errors,
including incorrectly handling missing data, incorrectly handling the complex survey structure of
NMES, and incorrectly calculating National SAFs for the Medicaid sample in NMES.
1. Harrison's Analyses Mishandle Missing Data

The Harrison Report confronts two major kinds of missing data problems. First, some of the
people in the National Medical Expenditure Survey (1987) did not respond to certain questions,
including questions about their smoking status, marital status, and medical expenditures.
[illegible], none of the individuals in their Oklahoma data set has any information about most
[illegible] variables in the national model. This second problem of missing data is handled by Harrison's
report by assuming that the main effects of sex, two levels of race and three levels of age are adequate
to describe Oklahoma Medi[illegible]. The inadequacy of this approach was discussed in Section
III.C.

Harrison handles the first problem, missing data in NMES, using three approaches, all of
which are generally [illegible]. First, [illegible] accepts imputations done by the Agency for Health Care Policy
and Research without question. Of particular importance, over 50% of the critical medical
[illegible] was imputed [illegible] Agency using a relatively crude, out-dated, hot-deck procedure.
[illegible] arbitrary im[illegible] for a variety of categorical variables. For example, see the
[illegible] at 24, footn[illegible] where a missing value to the "married" question is effectively
[illegible] married, so that [illegible] of unmarried respondents are coded identically as the group
[illegible]pondents. This method leads to biased adjustments for the actual marriage status
[illegible] fact, using this definition of "married," it appears as if fewer than 10% of the 2237
NMES Medicaid individuals used in the analyses are "married"; the reason is that of these 2237
individuals, 1754 are imputed to be not married.

Even ignoring the issues of biased estimation created by these two methods of handling [illegible]


The third method used by the Harrison Report to handle missing data is the so-called
"complete-case" method. After filling in data by the first two methods, 719 of the 2,237 NMES
Medicaid individuals still have missing data. Harrison's analyses simply discard these 719 individuals
and base analyses for SAFs on the remaining 1,518 complete cases. If the 719 nonrespondents
only randomly differed from the 1,518 complete cases, such discarding would result only in a loss of
efficiency of estimation. However, when the complete cases systematically differ from the
nonrespondents, basing analyses on the complete cases typically leads to biased estimation; that is
why the topic of missing data has received so much attention in the last two decades in statistics.
See Little, R.J.A. and Rubin, D.B. (1987) Statistical Analysis with Missing Data. New York: John
Wiley and Sons (with R.J.A. Little), translated into Russian in 1991: Finansy and Statistika
Publishers, Moscow, [illegible], translator; McLachlan, G.J. and Krishnan, T. (1997) The EM
Algorithm and Extensions, New York: John Wiley and Sons; Rubin, D.B. (1987) Multiple
Imputation for Nonresponse in Surveys. New York: John Wiley and Sons; Schafer, J.L. (1997)
Analysis of Incomplete Multivariate Data. New York: Chapman and Hall; Tanner, M.A. (1991)
Tools for Statistical Inference: Observed Data and Data Augmentation Methods, New York:
Springer-Verlag. There are valid ways to address problems of missing data, and the
Harrison analyses use none of these but rather old, largely discredited ad hoc methods.
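
The distinction between random and systematic differences can be illustrated with a small simulation. This is a hedged sketch on invented data, not NMES data; the missingness probabilities (60% for low values, 10% for high values) and all variable names are assumptions chosen only to make the effect visible. When missingness depends on the quantity being studied, the mean of the complete cases drifts away from the mean of the full group.

```python
import random
import statistics

random.seed(0)

# Hypothetical outcome for 5,000 simulated people (not NMES data).
full = [random.gauss(0.0, 1.0) for _ in range(5000)]

def is_observed(x):
    # Assumed mechanism: low values are far more likely to be
    # missing (60%) than high values (10%).
    p_missing = 0.6 if x < 0 else 0.1
    return random.random() >= p_missing

# "Complete-case" analysis: keep only the records that were observed.
complete_cases = [x for x in full if is_observed(x)]

full_mean = statistics.mean(full)
cc_mean = statistics.mean(complete_cases)
bias = cc_mean - full_mean  # clearly positive under this mechanism
```

If `is_observed` ignored `x` and dropped records purely at random, `bias` would be near zero and only the sample size (efficiency) would suffer.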

To examine whether the complete-case analysis used in the Harrison Report might be
acceptable, consider the difference between the 1,518 respondents and the 719 nonrespondents
with respect to the set of 25 variables fully observed for both groups. Table 5 summarizes these
differences between the complete cases and the nonrespondents [illegible]
expenses are nearly twice [illegible]
expenses paid by Medicaid are much larger for the respondents. Moreover, the respondents are twice
as likely to be married, less likely to be black, more likely to be employed, more likely to own their
own house, more likely to be treated for an alcohol-related disease, etc.

Clearly, the complete-cases as defined by Harrison's analyses are not a random sample of the
National Medical Sample in NMES, and discarding the one-third who are nonrespondents, which
Harrison's analyses do, cannot in general lead to valid inferences.

2. Harrison's Analyses Ignore Complex Nature of NMES Survey

NMES is a complex [illegible] with unequal weights obtained from a multi-stage design. The
Harrison Report [illegible] to ignore both the weights and the multi-stage nature of the
survey, [illegible] instead act as if [illegible] is a simple random sample of the eligible national population,
which it is not. The [illegible] design effectively leads to systematic overestimation of precision.
For example, bootstrap standard [illegible] derived assuming a simple random sample are systematically
too small and the associated [illegible] intervals are systematically too short. Consequently, none
of the results in the Harrison Report have the statistical reliability claimed for them even if there were
no other problems with the [illegible].
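
The effect of ignoring a multi-stage design on bootstrap precision can be sketched with simulated data. This is an illustration under assumptions, not a reconstruction of either report's computations: the 50 clusters of 20 units, the variance components, and the function names are invented, and survey weights are left out entirely so that only the clustering is shown.

```python
import random
import statistics

random.seed(1)

# Simulated two-stage sample: 50 clusters of 20 units each. Units in the
# same cluster share a cluster effect, so they are positively correlated.
clusters = []
for _ in range(50):
    effect = random.gauss(0.0, 1.0)
    clusters.append([effect + random.gauss(0.0, 1.0) for _ in range(20)])
units = [x for c in clusters for x in c]

def bootstrap_se(resample, reps=300):
    """Standard deviation of the sample mean over bootstrap replicates."""
    means = [statistics.mean(resample()) for _ in range(reps)]
    return statistics.stdev(means)

def srs_resample():
    # Treats every unit as an independent draw (ignores the design).
    return random.choices(units, k=len(units))

def cluster_resample():
    # Resamples whole clusters, respecting the first stage of the design.
    picked = random.choices(clusters, k=len(clusters))
    return [x for c in picked for x in c]

se_srs = bootstrap_se(srs_resample)          # too small for clustered data
se_cluster = bootstrap_se(cluster_resample)  # reflects between-cluster variation
```

In this setup the simple-random-sample bootstrap reports a standard error well below the cluster bootstrap, which is the direction of error described above.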

3. The SAFs Calculated in the Harrison Report Are Incorrect Because
They Do Not Apply to the Medicaid Population

Harrison's analysis correctly attempts to focus attention on a Medicaid population rather than
the entire national population. This focus is clear in the report's definition of SAF-M to apply only
to Medicaid [illegible] at 12 and in Harrison's deposition at 163, 723. But the analyses that calculate
the estimated values of SAF-M used to allocate medical expenditures create National population
SAFs, not Medicaid population SAFs. These analyses, described at 26-27 of Harrison's Report, are
based on a statistical model. The error occurs because the first equation in the statistical model,
predicting positive Medicaid expenditures (of each type), uses the NMES national sample rather than
the NMES Medicaid sample.

To summarize the differences between the NMES Medicaid and non-Medicaid samples, I
calculated propensity scores using the 43 predictors in the Harrison regression model and Mcaid, the
Medicaid indicator created in the Harrison Report. The difference of means of the propensity scores
is over [illegible] standard devia[illegible] the ratio of Medicaid to non-Medicaid variances along the
propensity score is only 0.30. For the residuals of the original predictors, 10 have variance ratios
outside the range [1/2, 2], and [illegible] have variance ratios outside the range [4/5, 5/4]. Clearly, the
NMES Medicaid sample is very different from the NMES non-Medicaid sample with respect to the
predictor variables used in the Harrison Report.

The result of [illegible] National SAFs rather than Medicaid SAFs appears to be biased
overestimation of [illegible] and [illegible] dollar amounts. Consequently, even if there were no other
errors or shortcomings in the analyses in the Harrison Report, this error alone means that
the report's conclusions are [illegible] unreliable.
TABLE 1

PERCENT REDUCTIONS IN BIAS USING REGRESS[illegible]

[illegible]tions in the two groups, and R is the ratio of the variances of x in the two groups






m 



ntv 



62 

298 

48 

-304 

80 

146 

72 

I 

100 

100 

lOl 

101 

101 

101 

102 

in 

298 

62 

-304 

48 

146 

80 

292 


292 

90 

123 

88 

170 

96 

M3 

102 

139 

102 

101 

101 

104 

104 

102 

102 

108 

108 

72 

123 

90 

170 

88 

113 

96 

139 

102 



\ 


• ■ i 

Note: If all bias is removed by regression adjustment, then all tabled values would be 100%. A negative number means that the adjustment,
instead of removing bias, creates more bias in the same direction as the original bias; 0% means that the adjustment does not accomplish
any bias reduction; a value larger than 200% indicates that the adjustment increases bias beyond the original amount but in the opposite
direction.
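
The reading of the tabled values described in the note can be written as a one-line formula. The function name and the example biases below are hypothetical and serve only to illustrate the scale; the formula is the usual definition of percent bias reduction, 100 x (1 - remaining bias / original bias).

```python
def percent_bias_reduction(bias_before, bias_after):
    """Percent of the original bias removed by an adjustment.

    bias_before: bias in the unadjusted comparison (must be nonzero).
    bias_after:  bias remaining after adjustment, same sign convention.
    """
    return 100.0 * (1.0 - bias_after / bias_before)

all_removed = percent_bias_reduction(1.0, 0.0)   # 100.0: all bias removed
no_change = percent_bias_reduction(1.0, 1.0)     # 0.0: nothing accomplished
made_worse = percent_bias_reduction(1.0, 2.0)    # -100.0: more bias, same direction
overshoot = percent_bias_reduction(1.0, -1.5)    # 250.0: larger bias, opposite direction
```

On this scale a tabled value of 298 corresponds to a remaining bias of about 1.98 times the original in the opposite direction, and a value of -304 to a remaining bias of about 4.04 times the original in the same direction.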


Source of values: Cochran and Rubin (1973), "Controlling Bias in Observational Studies: A Review," Sankhya, Series A, Volume
35, Part 4, Tables 3.2.1, 3.2.2, and 3.2.3, pages 427, 428, and 429.
TABLE 2

ESTIMATED PROP[illegible] IN [illegible] COVARIATES

B=Bias, R=Ratio of "smoker" to "nonsmoker" variances;
also displayed is the distribution of the ratio of variances in the 32 covariates
after adjusting for the propensity score

                              "Nonsmokers"  "Smokers"                   R for covariates after adjustment
Definition     Definition     Sample        Sample                            >1/2 and  >4/5 and  >5/4 and
of Nonsmoker   of Smoker      Size          Size         B      R      <1/2   <4/5      <5/4      <2        >2

Not Current    Current        926           592          0.91   0.92   0      2         22        4         4
Never          Ever           687           831          0.94   0.51   0      4         21        3         4
Unexposed      Exposed        482           1036         0.84   1.27   0      2         19        6         5
Unexposed      Passive        482           205          0.89   1.00   1      2         21        7         1
Unexposed      [illegible]    482           239          0.88   1.35   0      3         20        4         5
Unexposed      Current        482           592          1.18   0.61   0      3         18        6         5
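
The five right-hand columns of Table 2 count how many covariates fall into each band of residual variance ratios. A minimal sketch of that tabulation is given below; the treatment of values falling exactly on a boundary is an assumption (the table header is not legible enough to tell), and the example ratios are invented rather than taken from any analysis.

```python
from collections import Counter

def ratio_category(r):
    """Place a variance ratio in one of five bands (boundary handling assumed)."""
    if r <= 0.5:
        return "<=1/2"
    if r <= 0.8:
        return ">1/2 and <=4/5"
    if r <= 1.25:
        return ">4/5 and <=5/4"
    if r <= 2.0:
        return ">5/4 and <=2"
    return ">2"

ratios = [0.45, 0.7, 0.95, 1.1, 1.6, 2.4]  # invented values for illustration
counts = Counter(ratio_category(r) for r in ratios)
```

In each row of Table 2 the five counts add up to 32, the number of covariates.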
TABLE 3

ESTIMATED PROPENSI[illegible] VERSUS O[illegible]

B=Bias, R=Ratio of NMES to OK variances; also displayed is the distribution
of the ratio of variances in the 11 covariates after adjusting for the propensity score
TABLE 4

[illegible] AGE SQUARED

B=Bias, R=Ratio of NMES to OK variances; also displayed is the distribution
of the ratio of variances in the 19 covariates after adjusting for the propensity score
TABLE 5

Analysis of Missing Data Mechanism for Medicaid NMES Subjects (July 26, 1998)
Comparison of 26 Fully Observed Covariates for Medicaid Respondents and Nonrespondents in NMES

Number of Respondents:      1518
Number of Nonrespondents:    719


qs . 

Reloandents 

Nonrcspoadcnts - 



Mean r Standard 

fT^Jpevia ti on 

Mean 

Standard 

Deviation 

PValue 

ldnts#lfe^ 

0 728 I - ^ 1.649 

0.412 

1.298 

0.000 


(log | 

lexp 
(log t 

Itotsp 

(log! 


*8 e M i 

hisp^w 

(Hispad^ll 

bUck fj 
(Black )~ 


expense paid by 
6.274 

ense) 





id) . 
2.644 


il 5 {KHHB} 2.623 

pensejp«tt by MPfccaid) 



bject) 


incity 

(L» v e *«]ls®j') 


insub 
(Live i 


Q. 

Aim) 


! 4T*99 

0.144 

0.401 

0.226 

0.109 

0.349 

,* » 

0.377 

0.276 

0.181 



t.- ,, 

r -\ 



ownhome 
(Own Home) 

job 

(Currently have a job) 

vet 0.051 

(Whether Veteran) 


20.572 

0.351 

0.490 

0.418 

0.311 

0.477 

0.485 

0.447 

0.385- 

0.221 


4.954 

4.407 

49.693 

0.147 

0.459 

0.253 

0.054 

0.388 

0.346 

0.236 

0.125 

0.042 


-J2 

.t 


3.527 

3.334 

21.570 

0.355 

0.499 

0.435 

0.227 

0.488 

0.476 

0.425 

0.331 

0.200 


0.000 

0.000 

0.062 

0.811 

0.010 

0.163 

0.000 

0.076 

0.160 

0.043 

0.000 

0.303 


r 
                                                   Respondents            Nonrespondents
                                                            Standard               Standard
Variable                                           Mean     Deviation     Mean     Deviation    P Value

[illegible]                                        3.051    1.960         3.231    2.095        0.053
([illegible] of members in the family)
pinc                                               5.279    4.901         5.371    5.974        0.719
(Personal Income Variable in 1000's)
finc                                               14.432   14.451        16.220   18.354       0.022
(Family Income Variable in 1000's)
suth                                               0.352    0.478         0.371    0.484        0.386
(Southern Region)
neast                                              0.225    0.418         0.246    0.431        0.280
(Northeast Region)
west                                               0.174    0.379         0.167    0.373        0.680
(Western Region)
hsch                                               [illegible] 0.498      0.516    0.501        0.328
(High School)
college                                            0.105    0.306         0.093    0.291        0.388
(College education)
hlthconc                                           0.076    0.266         0.060    0.237        0.137
(Health condition)
povty                                              0.255    [illegible]   0.280    0.449        0.222
(Poverty status for Medical [illegible])
idrugs                                             0.006    0.077         0.003    0.053        0.258
(Indicator for Drug related disease)
ialc                                               0.228    0.420         0.175    0.380        0.003
(Indicator for Alcohol related disease)
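
The P Value column of Table 5 can be checked approximately from the tabled means, standard deviations, and group sizes. The sketch below assumes an unequal-variance comparison of two means with a normal approximation; the report does not state which test was used, so this choice and the function name are assumptions. The inputs are the family-size row as read from the table (3.051 and 1.960 for the 1,518 respondents, 3.231 and 2.095 for the 719 nonrespondents).

```python
import math

def two_sample_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sided p-value for a difference in means, using unequal-variance
    standard errors and a normal approximation (an assumed test)."""
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))

# Family-size row of Table 5: respondents vs. nonrespondents.
p = two_sample_p_value(3.051, 1.960, 1518, 3.231, 2.095, 719)  # approximately 0.053
```

Under this assumption the result agrees with the tabled value of 0.053 for that row, which suggests (but does not establish) that the table was produced with a test of this general kind.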
DISPLAY 1: List of 32 Covariates Used By Harrison

AGE
AGESQUARE
HISPANIC
BLACK
MALE
MARRIED
INCITY
INSUB
SOUTH
NEAST
WEST
OWNHOME
[illegible]
BMI
BMISQ
HIBLOOD
SEATBELT
PHYS[illegible]
TAKER[illegible]
HLTHCON[illegible]
ENGL[illegible]
NFAMILY
HSCHOOL
COLLEGE
[illegible]
[illegible]
PINC
PINC2
FINC
FINC2
POVTY
IALC
IDRUGS
SUPPLEMENTAL REPORT OF DONALD B. RUBIN

January 10, 1998

My initial July 19, 1997 Report in this matter expressed three broad
opinions concerning the June 2, 1997 Report by Zeger, et al. (the "Zeger
Report"). None of those opinions have changed. However, I have done
limited work on aspects of those opinions that further supports them,
as described in this Supplemental Report.

The Zeger Report employs regressions to adjust for background
characteristics of smokers and nonsmokers, such as their race, income, and
educational level. [illegible] regressions rely upon certain linear-additive
assumptions. My July 19 Report advised that the consequences of making
these linearity assumptions should be addressed. Because these
consequences are sensitive to the degree of overlap in the distributions of
the background variables between smokers and nonsmokers, I proposed
using propensity score methods to examine the overlap. Without such an
analysis, one cannot have any confidence in the Zeger Report's regressions'
ability to adjust reliably for background differences between smokers and
nonsmokers. (July 19 Report at 9-10).

The statistical literature warns that regression analysis cannot reliably
adjust for differences in background characteristics when there are
substantial differences in the distribution of the background variables in the
two groups.
For example, William G. Cochran, who served on the Advisory
Committee that wrote the 1964 Surgeon General's Report, wrote extensively
on methods for the analysis of observational studies, as summarized in my
chapter on his work on this topic in W.G. Cochran's Impact on Statistics
(Rao, 1984, John Wiley, New York). In 1957 Cochran wrote:

"[illegible] when the x-variables show real differences among groups - the
[illegible] which adjustment is needed most - covariance adjustments
[i.e., regression adjustments] involve a greater or less degree of
extrapolation. To illustrate by an extreme case, suppose that we were
[illegible] for differences in parents' income in a comparison of private
and public school [illegible] and that the private-school incomes ranged
from $10,000-$12,[illegible] the public-school incomes ranged from
[illegible]. [illegible] covariance would adjust results so that they
allegedly applied to [illegible] income of $8,000 in each group, although
[illegible] group has [illegible] observations in which incomes are at or even
[illegible] this level." Cochran, William G. "Analysis of Covariance: Its
[illegible] and Uses." Biometrics, Vol. 13, pp. 261-281.

In 1965, he wrote:

"[illegible] original x-distributions diverge widely, none of the methods
[e.g., regression adjustment] can be trusted to remove all, or nearly all,
the bias. This discussion brings out the importance of finding
comparison groups in which the initial differences among the
distributions of the disturbing variables are small."

And in the same article:
"With several x-variables, the common practice is to compare the
marginal distributions in the two groups for each x-variable
separately. The above argument makes it clear, however, that if the
form of the regression of y on the x's is unknown, identity of the
whole multi-variate distribution is required for freedom from bias."
Cochran, William G. "The planning of observational studies of
human populations." The Journal of the Royal Statistical Society, A,
128, pp. 234-[illegible].

In particular, [illegible] three basic distributional conditions that in
common practice must simultaneously obtain for regression adjustment
(whether by ordinary [illegible] regression, linear logistic regression or linear-log
regression) to be [illegible]. If any of these conditions is not satisfied, the
[illegible] distributions of background characteristics in the
two groups [illegible] as substantial, and regression adjustment, such
as performed in the Zeger Report, is unreliable and cannot be trusted. These
conditions are:

1. The difference [illegible] the means of the propensity scores in the two
[illegible] be small (e.g., the means must be less than half
[illegible] standard deviation a[illegible] the situation is benign in the sense that:
(a) the distributions of the background characteristics in both groups are
[illegible]tric, and (b) the distributions of the background characteristics
in both groups have nearly the same variances, and (c) the sample sizes are
approximately the same.

2. The ratio of the variances of the propensity scores in the two groups
must be close to one (e.g., 1/2 or 2 are far too extreme).




3. The ratio of the variance of the residuals of the original covariates
after adjusting for the propensity score must be close to one (e.g., 1/2 or 2 are
far too extreme).
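
The three quantities named in these conditions can be computed as in the following sketch. Several details are assumptions rather than anything stated in the report: the standardization of the mean difference by the square root of the average of the two group variances, the use of a single pooled simple linear regression of each covariate on the propensity score to form residuals, and all function names and toy numbers.

```python
import statistics

def standardized_bias(ps_a, ps_b):
    """Difference in propensity-score means in standard-deviation units
    (standardization by the average of the two variances is assumed)."""
    pooled_sd = ((statistics.variance(ps_a) + statistics.variance(ps_b)) / 2) ** 0.5
    return (statistics.mean(ps_a) - statistics.mean(ps_b)) / pooled_sd

def variance_ratio(xs_a, xs_b):
    """Ratio of sample variances, group a over group b."""
    return statistics.variance(xs_a) / statistics.variance(xs_b)

def residuals(x, ps):
    """Residuals of covariate x after simple linear regression on ps."""
    mx, mp = statistics.mean(x), statistics.mean(ps)
    sxy = sum((p - mp) * (v - mx) for p, v in zip(ps, x))
    sxx = sum((p - mp) ** 2 for p in ps)
    slope = sxy / sxx
    return [v - (mx + slope * (p - mp)) for p, v in zip(ps, x)]

# Toy linear propensity scores, invented for illustration.
ps_smokers = [1.0, 2.0, 3.0]
ps_nonsmokers = [0.0, 1.0, 2.0]
B = standardized_bias(ps_smokers, ps_nonsmokers)  # 1.0
R = variance_ratio(ps_smokers, ps_nonsmokers)     # 1.0
```

For condition 3, one would compute `residuals` for each covariate on the combined sample, split the residuals by group, and apply `variance_ratio` to the two pieces.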

Specific tabulations and calculations relevant to these points can be
[illegible], for example, in Cochran and Rubin (1973) "Controlling Bias in
Observational Studies: A Review," Sankhya, Series A, Vol. 35, Part 4, pp.
[illegible]; Rubin (1973) "The Use of Matched Sampling and Regression
Adjustments to Remove Bias in Observational Studies," Biometrics, 29, pp.
[illegible]; and Rubin (1979) "Using Multivariate Matched Sampling and
Regression Adjustment to Control Bias in Observational Studies," Journal of
the American Statistical Association, 74, pp. 318-328.



[illegible] particular, [illegible] Cochran and Rubin (1973, p. 426) state that "linear
regression on random [illegible] gives wildly erratic results . . sometimes
[illegible] overcorrecting [illegible] even (with B = 1/4 for e^x) greatly increasing the
original bias [when the [illegible] of the variances is one-half]". Table 3.2.2 in
that article implies that [illegible] the ratio of the variance of any covariate is
[illegible] regression [illegible] overcorrect for bias or grossly undercorrect
for bias. Relevant results [illegible] that table are summarized here as Table 1.
[illegible] guidelines [illegible] Table 1 address Zeger's regression adjustments
[illegible] or linear log scale because they too rely on additive effects
[illegible] covariates (for discussion of this point, see e.g., Anderson, Auquier,
Hauck, Oakes, Vandaele, and Weisberg (1980), Statistical Methods for
Comparative Studies, John Wiley, New York, p. 164).


Since my July 19 Report, I have used propensity score methods to
assess whether one can conclude that the adjustments made by the Zeger
Report's regressions can reliably adjust even for those background
differences included in the Zeger regressions; they cannot.

I first calculated three alternative propensity scores between smokers
and nonsmokers in each of the six age x gender groups defined by the Zeger
Report. The propensity scores were all estimated by logistic regression as
follows. Set 1, using only the 25 background characteristics included in
the Zeger regressions: the "main effects" propensity scores (see Display 1
for list of variables); Set 2, using the background characteristics included
in the Zeger regressions plus all two-way interactions (products) among those
background variables plus the square of age: the "all interactions"
propensity scores; and Set 3, using the background characteristics included
in the Zeger regressions and selecting two-way interactions among those
variables using a step-wise procedure: the "step-wise" propensity scores
(see Display 1 for the interactions included). These propensity scores
(which are estimated probabilities) were then transformed to the logit scale
so that they were linear in the original covariates. This transformation was
done for three reasons. First, relative to the raw propensity (probability)
scale, the linear propensity (logit probability) scale is more relevant for
assessing the efficacy of linear modeling adjustments (including those based
on logistic regression and linear-log models). Second, the linear propensity
scores tend to produce more benign distributions with more similar variances
and more symmetry, because they are weighted averages of the original
covariate values. And third, the linear propensity scores are more directly
related to results in the literature on adjustments for covariates based on
linearity assumptions.
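
The logit transformation mentioned above can be illustrated briefly. In a logistic regression the estimated probability is the inverse logit of a linear combination of the covariates, so applying the logit recovers that linear combination. The coefficients and covariate values below are invented for illustration and are not estimates from any analysis.

```python
import math

def logit(p):
    """Map an estimated probability in (0, 1) to the linear (logit) scale."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Map a linear predictor to a probability, as logistic regression does."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted model: intercept and two coefficients.
intercept, coefs = -1.0, [0.5, -0.25]
x = [2.0, 4.0]                                # one subject's covariate values
linear_score = intercept + sum(b * v for b, v in zip(coefs, x))  # -1.0
probability = inv_logit(linear_score)         # raw propensity score
recovered = logit(probability)                # back on the linear scale
```

Here `recovered` equals `linear_score` up to floating-point rounding, which is the sense in which the logit-scale propensity score is linear in the original covariates.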
I next calculated, for each set of (linear) propensity scores in each age
x gender group: (1) the difference in the means of the propensity scores
between smokers and nonsmokers; (2) the ratio of the variance of the
propensity scores for smokers and nonsmokers; and (3) for each of the
original covariates, the ratio of the variance of residuals of the propensity
scores for smokers to the variance of the residuals for nonsmokers (i.e., the
residuals after adjusting for the propensity scores). The results of those
[illegible] are found in Tables 2-4 attached hereto. 1

Comparing these results to the benchmarks for reliability in the
statistical literature, [illegible] clear that the Zeger regressions do not adjust
reliably even for those background characteristics between smokers and
nonsmokers included in their models. First, focus on the main effects
model. In all six age by gender cells, the distributions of covariates fail to meet
[illegible] three or fail to meet any of the three reliability criteria. Consider
the bias of the propensity scores: rounded to one decimal place, all of the
six [illegible] in the propensity scores are at least one-half a standard deviation
apart and the situation is not benign; three of the ratios of the variances of
the propensity scores are about one-half, one is less than a quarter. Now
[illegible] the residuals of the original 25 covariates and the ratios of their
[illegible] among smokers and nonsmokers in each of the six age x sex
[illegible] each of the six cells there is at least one adjusted covariate with a
[illegible] 1/2, in one cell, four such covariates. And in each cell except
one, there are at least eight ratios outside the range from 4/5 to 5/4.



1 The computer programs and output are also attached in electronic form.

2 The benchmark values given in Table 1 are accurate for the difference of means and the
ratio of variances of the propensity score. They are not as accurate for the residuals of the
covariates because most of the covariates are discrete. Nevertheless, the more the ratios
of variances for the residuals differ from 1, the larger the difference between the
distributions of the residuals in smokers and nonsmokers.

The "all-interactions" and "step-wise" propensity score analyses yield
qualitatively similar results, although even somewhat more extreme. For
instance, with the "step-wise" propensity scores, in four of the six cells, the
propensity score means differ by more than half a standard
deviation and in the other two cells differ by nearly one-half a standard
deviation. Moreover, in three of the six cells, the ratios of the variance of
the propensity scores are [illegible] one-half. And
for the residuals of the covariates, totalling over all six cells, we see that
more than half the ratios are outside the range 4/5 to 5/4, and more than 15%
are outside the range 1/2 to 2.


A subgroup of interest comprises the males, ages 19-34. In the
Zeger Report, this subgroup accounts for about $720 million of the $952 million in
claimed smoking-attributable diminished health status expenditures. Focus
on this group and the main effects model. The difference in the
means of the propensity scores is nearly a full standard deviation, and the
ratio of the variances of the propensity scores is .65. Moreover, for the
residuals of the covariates, two of the variance ratios are less than 1/2, and
[illegible] are outside the range 4/5 to 5/4; although not displayed, the ratios
of the residuals range from nearly 0 to nearly 2. The statistical guidelines
indicated in Table 1 mean that, in this situation, Zeger's regressions could
grossly overcorrect or grossly undercorrect for bias.

In summary, Zeger's regressions do not reliably adjust for background
characteristics between smokers and nonsmokers in any of the six age x sex
cells.



In my July 19 Report, I expressed my opinions that the Zeger Report
estimates neither the sums plaintiffs expended as a result of defendants'
wrongful conduct nor the sums plaintiffs expended as a result of the
existence of smoking (July 19 Report at 1-10). My opinions were
effectively confirmed subsequently by Drs. Zeger, Wyant, and Miller in their
depositions. 3

In my July 19 Report, I also described the correct approach to
estimating the sums plaintiffs expended as a result of defendants' alleged
wrongful conduct or the sums plaintiffs expended as a result of the existence
of smoking (July 19 Report at 1-10). That approach was subsequently
effectively confirmed by Dr. Samet in his deposition. 4

At my deposition, I was asked to identify data sets that could be used
to help estimate the sums plaintiffs expended as a result of either defendants'
alleged wrongful conduct or the existence of smoking. I have since started
the process of trying to identify some data sets that could be used to do so.

Some U.S. data sets that appear to be relevant are identified in Display 2.
Some international data sets that appear to be relevant are identified in
Display 3.

3 Zeger Dep. at 332, 329; Wyant Dep. at 237, 254; Miller Dep. at 250, 248.

4 Samet Dep. at 395-396.

Also, a survey of Minnesota Medicaid recipients could be designed
that collected not only complete information of the type in NMES and
BRFSS, but also information on reasons for smoking behavior and how it
was affected by past and current tobacco industry conduct, as well as how
industry conduct might have affected alternative behaviors (e.g.,
[illegible] or eating) in the absence of smoking. Moreover, assumptions can
be formulated under which relatively straightforward analyses of accessible
data can be used to estimate the causal effects of alleged misconduct.






Professor Donald B. Rubin 

TABLE 1


PERCENT REDUCTIONS IN BIAS USING REGRESSION ADJUSTMENT:
x is normally distributed [illegible] non-linear [illegible]
distributions in the two groups, and R is the ratio of the variances of x in the two groups.

[Column headings, which appear to be response functions of the form exp(x/2), exp(-x/2),
exp(x) and exp(-x), and most row labels are not legible in the source. Bracketed row labels
are inferred.]

  R
 [1/2]       62    298     48   -304
  1         100    100    101    101
 [2]        298     62   -304     41

             80    146     72    292     90    123
            101    101    102    102    101    101
            146     80    292     72    123     90

             88    170     96    113    102    139
            104    104    102    102    108    108
            170     88    113     96    139    102

Note: If all bias is removed by regression adjustment, then all tabled values would be 100%. A negative number means
that the adjustment, instead of removing bias, creates more bias in the same direction as the original bias; 0% means that
the adjustment does not accomplish any bias reduction; a value larger than 200% indicates that the adjustment increases
bias beyond the original amount but in the opposite direction.
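The reading of the tabled values given in the Note can be made concrete with a short sketch. It assumes the usual definition of percent reduction in bias, 100 x (1 - remaining bias / initial bias); the function names are hypothetical and the wording of the descriptions paraphrases the Note.

```python
def percent_bias_reduction(initial_bias, remaining_bias):
    """100 * (1 - remaining/initial); 100 means all bias was removed."""
    if initial_bias == 0:
        raise ValueError("initial bias must be nonzero")
    return 100.0 * (1.0 - remaining_bias / initial_bias)

def describe(pct):
    """Plain-language reading of a tabled value, following the Note to Table 1."""
    if pct < 0:
        return "adjustment adds bias in the original direction"
    if pct == 0:
        return "no bias reduction"
    if pct < 100:
        return "partial bias reduction"
    if pct == 100:
        return "all bias removed"
    if pct <= 200:
        return "overcorrection, remaining bias no larger than the original"
    return "overcorrection, remaining bias larger than the original, opposite direction"
```

For example, a tabled value of 298 corresponds to a remaining bias almost twice the size of the original bias and of opposite sign, and a value of -304 corresponds to a remaining bias about four times the original in the same direction.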


Source of values: Cochran and Rubin (1973). "Controlling Bias in Observational Studies: A Review", Sankhya,
Series A, Volume 35, Part 4, Tables 3.2.1, 3.2.2, and 3.2.3, pages 427, 428, and 429.

TABLE 2

Main Effects Model


Estimated propensity scores on the logit scale for smokers and nonsmokers: Main effects model; B=Bias, R=Ratio of smoker
to nonsmoker variances; also displayed is the distribution of the ratio of variances in the 25 covariates after adjusting for the
propensity score

                                                        R for covariates after adjustment (B=0)
                  Nonsmokers  Smokers
 Sex     Age      Sample      Sample                            >1/2 and   >4/5 and   >5/4 and
 Group            Size        Size       B       R      <1/2    <4/5       <5/4       <2

 Female  19-34    2417        1633       0.78    0.71    1       2         16          6
 Female  35-64    2859        2324       0.46    0.53    1       2         22          0
 Female  65+      2101         935       0.64    0.53    3       4         15          3
 Male    19-34    2017        1453       0.88    0.65    2       4         14          5
 Male    35-64    1688        2696       0.72    0.49    1       5         14          5
 Male    65+       793        1264       0.68    0.21    4       4         17          0
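The covariate columns of Table 2 are counts of variance ratios falling in fixed ranges. A minimal sketch of such a tally follows; how values exactly on a boundary are assigned, and the treatment of ratios of 2 or more, are assumptions, as are the function names.

```python
def tally_variance_ratios(ratios):
    """Count ratios in the four ranges used by the covariate columns of Table 2."""
    # Assumptions: boundary values go to the more central range, and ratios
    # of 2 or more (none are shown in Table 2) are counted in the last range.
    counts = {"<1/2": 0, "1/2 to 4/5": 0, "4/5 to 5/4": 0, ">5/4": 0}
    for r in ratios:
        if r < 0.5:
            counts["<1/2"] += 1
        elif r < 0.8:
            counts["1/2 to 4/5"] += 1
        elif r <= 1.25:
            counts["4/5 to 5/4"] += 1
        else:
            counts[">5/4"] += 1
    return counts

def outside_central_range(counts):
    """Number of covariates whose ratio is outside 4/5 to 5/4."""
    return sum(n for label, n in counts.items() if label != "4/5 to 5/4")

# Counts copied from the Male 19-34 row of Table 2.
male_19_34 = {"<1/2": 2, "1/2 to 4/5": 4, "4/5 to 5/4": 14, ">5/4": 5}
n_outside = outside_central_range(male_19_34)
```

Applied to the rows of Table 2, the second function reproduces the statement in the text that every cell except one has at least eight ratios outside the range 4/5 to 5/4.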

Estimated propensity scores on the logit scale for smokers and nonsmokers: Main effects + stepwise selection of interaction
effects; B=Bias, R=Ratio of smoker to nonsmoker variances; also displayed is the distribution of the ratio of variances in the
76 covariates after adjusting for the propensity score

http://legacy.library.ucsf.edu/tid/poqO7a00/prdfe: https://www.industrydocuments.ucsf.edu/docs/xygl0001 







Interaction Model


Estimated propensity scores on the logit scale for smokers and nonsmokers: All two-way interaction effects model; B=Bias,
R=Ratio of smoker to nonsmoker variances; also displayed is the distribution of the ratio of variances in the 304 covariates
after adjusting for the propensity score

The list of variables used in the analysis in the main effects model:


Race                          (raceothr)
[illegible]                   (midwest)
[illegible]                   (misseduc, hsgrad, collsome, collgrad)
[illegible]                   (lowinc, midinc, highinc)
[illegible] Status            (sepdiv, widowed, nvrmarr)
[illegible]                   ([illegible], uninsur, public)
[illegible]                   (overwght, sevwght, miswght)
[illegible]                   (msblt, sbltrare, sbltsome)
Self-reported Health Status   (Rate 1 (poor), Rate 2 (fair), Rate 3 (good))


Interaction variables that entered in the stepwise selection:


v[?]=age*age, v[?]=age*raceothr, v3=age*midwest,
v11=age*s[illegible], v15=age*uninsur, v19=age*msblt,
v[?]=age*sbltrare, v22=age*public, v23=age*rate1,
v2[?]=raceothr*midwest, v[?]=raceothr*misseduc,
v29=raceothr*collsome, v[?]=raceothr*collgrad,
v31=raceothr*lowinc, v[?]=raceothr*highinc,
v[?]=raceothr*privx, v[?]=raceothr*overwght,
v43=raceothr*sbltrare, v[?]=midwest*nvrmarr,
v7[?]=hsgrad*highinc, v77=hsgrad*privx, v30=hsgrad*sevwght,
v[?]=hsgrad*rate3, v9[?]=misseduc*widowed,
v94=misseduc*nvrmarr, v[?]=misseduc*uninsur,
v97=misseduc*overwght, v101=misseduc*sbltrare,
v12[?]=collsome*rate1, v141=collgrad*rate2,
v1[?]=collgrad*rate3, v151=lowinc*msblt,
v1[?]=lowinc*sbltrare, v163=midinc*overwght,
v1[?]=midinc*rate2, v180=highinc*miswght,
v1[?]=sepdiv*sbltrare, v197=sepdiv*rate1,
v200=widowed*privx, v204=widowed*miswght,
v212=nvrmarr*privx, v214=nvrmarr*overwght,
v216=nvrmarr*miswght, v219=nvrmarr*sbltsome,
v231=privx*rate1, v234=uninsur*overwght,
v245=overwght*sbltrare, v254=sevwght*public,
v255=sevwght*rate1, v276=sbltsome*rate2
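Two-way interaction variables of the kind listed above are products of pairs of main-effect variables. The sketch below shows one way such product terms could be enumerated and evaluated. It is only an illustration: the grouping dictionary, the rule of skipping pairs of indicators from the same group, and the function names are assumptions, and the sketch does not reproduce the numbering (v3, v11, and so on) or the stepwise selection used in the analysis.

```python
from itertools import combinations

def interaction_names(groups):
    """Names of two-way products, skipping pairs of indicators from one group."""
    flat = [(g, v) for g, variables in groups.items() for v in variables]
    names = []
    for (g1, v1), (g2, v2) in combinations(flat, 2):
        if g1 != g2:  # assumed rule: indicators from the same group are not paired
            names.append(f"{v1}*{v2}")
    return names

def interaction_value(row, name):
    """Value of one product term for a single record (a dict of covariates)."""
    left, right = name.split("*")
    return row[left] * row[right]

# Hypothetical subset of the main-effect variables, for illustration only.
groups = {
    "age": ["age"],
    "region": ["midwest"],
    "income": ["lowinc", "midinc", "highinc"],
}
names = interaction_names(groups)
```

With the three groups shown, the sketch yields seven product terms, such as age*midwest and midwest*lowinc.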

Display 2: Some U.S. Data Sources Relevant to the Study of
Consumption of Cigarettes as Related to Information and Laws.


The data sets described below are apparently readily available, although not
entirely independent.


Data on per capita cigarette sales (in packs) annually by
U.S. states for the period 1955-1985, as utilized in [1].

Data base constructed for use in [2], which has yearly
data from 1949-1985 on U.S. brand consumption,
[illegible] and advertising, [illegible] socioeconomic indicators (average
price for cigarettes, [illegible], and per capita consumption).

Data [illegible] in the Appendix of [3] in their
study of the effect of 1953 and 1964 health information
[illegible] on the consumption of cigarettes over time.

Data sources described in the Appendix of [4] for 46 U.S.
states from 1963-1980 on per capita consumption of
cigarettes, price per pack, per capita disposable income,
and other information such as advertising expenditures.

Data used to study the effect of the 1971 ban on
television and radio advertising described in [5], Table 3.

Data in [6] including on consumption and prices for
filter-tipped cigarettes, for 1954-1982.

Display 3: Some Possible Foreign Data Sources Relevant to the Study of
Consumption of Cigarettes as Related to Information and Laws.


Data set provided in the Appendix of [7] from 22 countries
(including the U.S.) from 1960-1986 on cigarette
consumption, advertising, price, GDP, percent
female workforce and percent manufactured cigarettes.

Data displayed in Section 5 of [8] for 16 countries (non
U.S.) in a study of the effect of tobacco advertising bans.

[illegible]lyses in [9] for the effects of advertising

A STUDY OF THE EFFECTS OF IMPUTATION ON VARIANCE ESTIMATES

John Sommers, Agency for Health Care Policy and Research
Executive Office Center, 2101 E. Jefferson Street, Suite 500, Rockville, MD 20852


KEY WORDS: Imputation, variances, National Medical
Expenditure Survey, Rao-Shao

Background

The next National Medical Expenditure
Survey (NMES) is scheduled to be conducted in 1996 by
the Agency for Health Care Policy and
Research (AHCPR). [illegible] to the complexity of medical
expenditure data, [illegible] response rates for some
types of expenditures [illegible] quite low, especially for
hospital related [illegible]. As a result of this
problem in the last [illegible] 1987, a follow up survey
of medical providers, the Medical Provider
Survey (MPS), was [illegible] and is currently planned
for the 1996 [illegible] ([illegible] and Ward, 1992).
Also AHCPR only [illegible] provider data to impute
[illegible] nonresponse in 1987, [illegible] further limited the
effective response rate. However, because of fears of
poor quality household [illegible] (Cohen and Carlson, 1994), the
household [illegible] was [illegible] for imputation (AHCPR,
1992).

[illegible] this supplementary survey, response rates
in 1987 were very poor, especially for the three hospital
related expenditures: emergency room (EROM),
outpatient (OPAT) and inpatient (STAZ). The combined
item response rates, [illegible] if either source was
accepted as a response, were less than 60% for two of
the three expenditure types, OPAT and EROM. The
STAZ rate was less [illegible]

The combination of the low response rate, the
uncertainty of the quality of the household data and the
cost of the follow up survey create many design
questions. Among them are:

1 What are the effects on [illegible] of using household
[illegible] for imputation and estimation?

2 What is the effect on estimates and variances of
using only provider data for imputation?

3 What are the effects on the estimates and their
variances if less than a 100% follow up sample were
conducted in order to reduce costs?


In order to try to gain insight into these issues,
AHCPR has been conducting empirical analyses on
these questions. This paper reports some of our results
and observations.


We first discuss general methodology and then
present a series of results. With each result we try to
point towards peculiarities and issues. Finally, we
try to bring together the pattern of results we have given
and discuss a reason that could be examined further to
help explain the results we have.

Methodology

In order to empirically assess these questions,
AHCPR reimputed and estimated totals and their
variances for the 1987 expenditures for the 3
expenditure types. This was done for six different
scenarios, each intended to simulate one of the possible
choices AHCPR could make on these issues. To create
the scenarios we created, for each type of expenditure,
three different respondent sets. Each of the three sets
includes all responses from household respondents. To
these responses we added provider responses from three
different subsamples of events: 100%, 75% and 50%
subsamples. The combination of these subsamples of
provider responses with the household responses yields
the three sets of item responses and non responses. We
impute the three sets of non respondents for each
expenditure type, using two donor sets:

1 all respondents for that expenditure type and

2 only respondents for that expenditure from the
provider survey.


For each of the six combinations for each
expenditure type, we perform sequential hot deck
imputation (Cox, 1980) using the same imputation edit.
We calculate estimates of totals for ten different
demographic subsets based on age, race and sex of the
population and calculate their variances. We calculate
variance estimates using standard balanced repeated
replication (BRR), Rao-Shao BRR adjusted for
imputation (Rao and Shao, 1992) and Taylor Series
methods provided by the standard variance estimation
package SUDAAN (Shah, 1981).
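The idea behind sequential hot deck imputation can be illustrated with a simplified sketch. It is unweighted and is therefore not the weighted procedure of Cox (1980) used for NMES: each missing value is filled with the most recently seen reported value in the same imputation cell. The record layout, key names and function name are hypothetical.

```python
def sequential_hot_deck(records, cell_key, value_key):
    """Fill missing values with the most recent reported value in the same cell."""
    last_donor = {}      # imputation cell -> last reported value seen so far
    completed = []
    for rec in records:  # records are assumed to arrive in a meaningful sort order
        rec = dict(rec)  # copy so the caller's records are not modified
        cell = rec[cell_key]
        if rec[value_key] is not None:
            last_donor[cell] = rec[value_key]
            rec["imputed"] = False
        elif cell in last_donor:
            rec[value_key] = last_donor[cell]
            rec["imputed"] = True
        else:
            rec["imputed"] = False  # no donor yet in this cell; value stays missing
        completed.append(rec)
    return completed

# Hypothetical events: 'cell' is the imputation cell, 'cost' the expenditure.
events = [
    {"cell": "a", "cost": 100.0},
    {"cell": "a", "cost": None},
    {"cell": "b", "cost": None},
    {"cell": "b", "cost": 40.0},
    {"cell": "b", "cost": None},
]
filled = sequential_hot_deck(events, "cell", "cost")
```

Restricting the donor set to provider responses, as in the second donor set above, would correspond to updating `last_donor` only for records that come from the provider survey.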

Table A gives the size of the response and non
response subsets for each of the six scenarios for each
expenditure type.

Table A


Response Set Percentages

[Bracketed entries are inferred; they are not legible in the source.]

Expenditure  Donor      MPS%    Donors   Respondent   Non
Type         Set                         Non Donor    Respondents

STAZ         All        [100]   .670     0            .330
STAZ         Provider   [100]   .540     .130         .330
STAZ         All        [75]    .618     0            .382
STAZ         Provider   [75]    .433     .185         .382
STAZ         All        [50]    .571     0            .429
STAZ         Provider   [50]    .330     .241         .429

[OPAT]       All        [100]   .575     0            .425
[OPAT]       Provider   [100]   .406     .169         .425
[OPAT]       All        [75]    .547     0            .453
[OPAT]       Provider   [75]    .332     .215         .453
[OPAT]       All        [50]    .520     0            .480
[OPAT]       Provider   [50]    .263     .257         .480

[EROM]       All        [100]   .529     0            .471
[EROM]       Provider   [100]   .342     .187         .471
[EROM]       All        [75]    .503     0            .497
[EROM]       Provider   [75]    [.275]   [.228]       .497
[EROM]       All        [50]    .478     0            .522
[EROM]       Provider   [50]    .209     .269         .522


[illegible] one can see the size of the donor sets of respondents
is never high and [illegible] very sparse for some of the
cases. One would expect potentially large increases in
variances due to lack of effective sample size and the
extreme amount of imputation. One would also expect
potential estimation [illegible] if the household and
providers give very different values for the same events.
This would create estimates [illegible] different values.


Table B contains the [illegible] average relative values of
variances, calculated using the Rao-Shao adjusted BRR
method, for the different expenditure groups. The
values are the average relative increase (decrease) in
variances for the 10 demographic subsets for each of the
cases compared to the All x 100% followup survey
case for that expenditure. Also given are relative values
of the factor F for each case where

[formula for F illegible in source]

where

nr is the proportion of the sample that [illegible]
respondents

rd is the proportion of the sample which are respondents
and serve as donors for imputation and

rnd is the proportion of the sample which are
respondents and not used as donors for imputation.

If the original sample were a simple random
sample (SRS) with fixed non response [illegible] and size of
the [illegible] the imputation had an expected
variance per unit of non response equal to the SRS
sampling variance, then an approximate variance for an
estimate made with the imputed data set would be a
constant times F. We feel that use of the relative
increases and comparisons to the factor F helps us
normalize the results by considering different unit
variances in the sets and different non response rates.

Comparison of the increases in average variances
with the increases in F points us to two possible issues.

1 While we realize that F is a very simple measure,
we note that in general the values of increases in
variances generally increase with F. However, of
particular interest was the fact for the first two sets of

Per Cent of Variance Due to Imputation


the earlier patterns shown in the per cent of variance
due to imputation. This value varies by expenditure
type. However, because of correlations between
replicate estimates and the size of the differential
adjustment made to provide for the effects of
imputation, these differences are different than the
estimates of the effects of imputation on the
variances (Sommers, 1994).

Of some interest are the estimates made using the
SUDAAN Taylor Series results. They are less
predictable than the standard BRR estimates which are
almost always lower than the adjusted variance
estimates. Taylor Series estimates many times are
higher than the Rao-Shao Adjusted BRR
estimates. [illegible] When working with the full
sample variances, [illegible] in variance estimates of
the Taylor Series [illegible] the standard BRR was
attributed to the [illegible] the BRR considered post
stratification adjustments while the Taylor series did not.

However, since [illegible] we have also calculated the
newer adjusted [illegible] estimates for the variety of
expenditure and nonresponse sets. Obviously, this was
[illegible] a false assumption [illegible] upon the small sample of
[illegible] we have. [illegible] that Taylor Series estimates
[illegible] the different respondent [illegible] of variances. They seem not
[illegible] effective sample size. [illegible]
this, since Taylor Series [illegible] imputation [illegible]


Table E shows [illegible] values of estimates of total
expenditures for [illegible] three expenditure types for
each of the six [illegible]. Values are taken relative to
the value of the [illegible] the All x 100% case for
the same expenditure [illegible].


[illegible] stable. This stability is representative of the results for
all demographic sets we examined, even those with less
than 20% of the total population as donors. If one
assumed that the only variance was due to the
imputation and used the values in Table C as the
variance of the estimate given the sample, one could
not reject the hypothesis that each of the estimates
within a given expenditure type had a common mean
value for the sample. This is reassuring from an
estimation perspective. Previous analyses (Cohen and
Carlson, 1994) where paired comparisons were done
for events with both household and provider responses,
showed no statistically significant differences in the
means of the two sets of responses. However, the
variances were rather large and average relative
differences varied from 0 to 10%. Our results on means
seem to indicate that this relative difference does not
transfer to the estimates of totals.


Analyses

Since we can only analyze 3 data sets, we feel our
information for analysis is sparse. While there are
probably many reasons for variance increases, the
percent of variance due to imputation and the stability
of estimates, we focus on a single possibility in our
analysis. Schaefer, Khare and Ezzati-Rice (1993), when
they obtained a percent of variance due to imputation
less than expected, proposed that the reason was that
there was less information lost than the amount of data
lost due to item non response. Following this 'advice'
we ran regressions for the three donor sets using the
reported expenditure as the dependent variable and the
cells used for imputation as the independent variables.

Table F shows a small number of illustrative results.
The table shows the R² value for 2 regressions run on the
100% samples for each type of expenditure. One is on
the set of donors which includes all respondents. The
other uses only the set of provider donors for the
regression. The other value shown is the ratio of the
mean square error for the two regressions, i.e., the ratio
of the provider regression MSE to the All regression
MSE.
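A regression of reported expenditure on imputation-cell indicators has the cell means as its fitted values, so its R² and mean square error can be computed directly from within-cell deviations. The following is a minimal unweighted sketch under that assumption; survey weights are ignored and the function name is hypothetical.

```python
from collections import defaultdict

def cell_regression_fit(values, cells):
    """R-squared and MSE for a regression of values on imputation-cell indicators."""
    by_cell = defaultdict(list)
    for y, c in zip(values, cells):
        by_cell[c].append(y)
    grand_mean = sum(values) / len(values)
    sst = sum((y - grand_mean) ** 2 for y in values)
    # With one indicator per cell, the fitted value is the cell mean.
    sse = sum((y - sum(ys) / len(ys)) ** 2 for ys in by_cell.values() for y in ys)
    r_squared = 1.0 - sse / sst
    mse = sse / (len(values) - len(by_cell))  # residual degrees of freedom
    return r_squared, mse

# Hypothetical expenditures in two imputation cells.
r2, mse = cell_regression_fit([1.0, 3.0, 10.0, 12.0], ["a", "a", "b", "b"])
```

The MSE ratio reported in Table F would then be the MSE from the provider-donor regression divided by the MSE from the all-respondent regression.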

One immediately notices the difference between
EROM and STAZ. If the value of the R² is used as an
indicator of the information available about the size of
an expenditure if one has the characteristics of the
medical visit (if R² were 1 we could predict the cost
with no error), then we have the most information for
STAZ and the least for EROM, with OPAT between.
This relates well to most of what has been presented.
The STAZ had the least increases in variances. The
order of average percent of variances due to imputation,
from smallest to largest, was STAZ, OPAT and
EROM. STAZ had the most stable estimates of totals.
One could attribute the stability to better imputed
values. One could even attribute the smaller variances
for STAZ when the provider responses only are used for
imputation to two points: (1) the improved R² and (2)
the smaller variance within the data set as implied by
the reduction in MSE beyond that which would occur
just due to a better R².

Table E

Relative Estimates of Totals by Donor and Respondent Sets


Expenditure   Donor
Type          Set        MPS%    Estimate

STAZ          All        100     1.0000
STAZ          Provider   100      .9997
STAZ          All         75     1.0009
STAZ          Provider    75      .9914
STAZ          All         50      .9783
STAZ          Provider    50     1.0007

EROM          All        100     1.0000
EROM          Provider   100     1.0201
EROM          All         75     1.0020
EROM          Provider    75     1.0116
EROM          All         50     1.0266
EROM          Provider    50     1.0354

OPAT          All        100     1.0000
OPAT          Provider   100     1.0231
OPAT          All         75      .9807
OPAT          Provider    75     1.0364
OPAT          All         50      .9747
OPAT          Provider    50     1.0152


Table F


Expenditure   All Set R²   Provider Set R²   MSE Ratio

EROM          .036         .034              1.14
OPAT          .117         .149              1.06
STAZ          .175         .216              [illegible]


To further [illegible] effect of the quality of
imputation as a factor [illegible] results, we reimputed the
data sets using only one cell for imputation. As we
would have hypothesized, the means for STAZ and
OPAT changed [illegible], signifying the loss of
information. For EROM with the poor regression, the
means changed about 3%. Likewise, the average
percent of variance due to imputation almost tripled for
STAZ using the single cell, about doubled for OPAT
and moved only 3% for EROM.


Our results indicate to AHCPR that several steps
could be useful in helping to determine future
imputation results.


1 AHCPR should work toward development of more
precise relationships between indicators of quality of
imputation and final variance results. Such relationships
could be developed either through theoretical means or,
lacking this, through use of more controlled simulation
studies.

2 AHCPR must do more careful analyses of both
potential expected values and indicators of quality of
imputation in the future, in order to make imputation
decisions which lead to estimators which yield lowest
MSE results for the money available.

3 AHCPR should use Taylor Series for variance
estimates with caution for data sets with large amounts
of imputations.

4 AHCPR needs to determine the effects of other types
of imputation on variances.

(references upon request)

and other government surveys we felt the potential
underestimation of variances was an important research
issue. The more recent methods of Rao and Shao were
chosen because they appeared simpler and less
expensive.

Particular Problem

Currently, variances for NMES data are calculated
using Taylor Series (Cohen, DiGaetano, and Waksberg,
1991). This is the simplest method for AHCPR to
calculate variances. It only requires a single imputation
and a single set of weights. Multiple imputation would
require more [illegible] imputation. Use of replicate
[illegible] requires [illegible] which are available
[illegible] not on NMES [illegible] Files and calculation of
[illegible] replicate also [illegible]
numerous imputation sets [illegible] within each data
[illegible] to perform these new replication methods
[illegible] not at all the simple [illegible] process
[illegible] by Rao and Shao, who only [illegible]
imputation cell and thus [illegible] knowledge [illegible]
points are imputed, which is [illegible]
NMES PUFs (AHCPR, 1992). [illegible]
complexities, AHCPR has [illegible]
to provide information to [illegible] for NMES.

To start this [illegible], variances for total
expenditures for [illegible] population and several
subpopulations were [illegible] using standard BRR and
Taylor Series, [illegible] with the
new adjusted BRR [illegible] considers imputation.
[illegible] was done for [illegible] of expenditures:
Inpatient [illegible] (STAZ) and
Physician office [illegible].
These were [illegible] of the differences in
imputation rates, [illegible] of expenditures [illegible]
percent of the population with such expenditures.
For STAZ, there are approximately 36% of the
[illegible] of the population have such an expenditure. For MVIS
approximately 28% of the visits are imputed and about
[illegible] have such an expenditure. Even with the smaller
[illegible] imputed from MVIS, formula (1) would
indicate a potential underestimate of 40% for NMES
variances.

Imputation for NMES is at the event level (hospital
stay, office visit) where weighted sequential hot decking
is done for a number of imputation cells. Cells were
[illegible] by [illegible] and
those which proved to be of importance in prediction
models (See AHCPR, PUF's 14.4 and 14.5, 1992).

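Letting $y^*_{ij}$ be the value of the jth imputed event in
the ith imputation cell and $y_{ij}$ be the value of jth donor
event in the ith imputation cell and $w_{ijr}$ the weight for
the ijth event in the rth replicate, then the adjusted
expenditure estimate for the rth replicate (there are 76
replicates for NMES) is

$$\hat{Y}_r = \sum_i \sum_{j \in D_i} w_{ijr}\, y_{ij} + \sum_i \sum_{j \in S_i} w_{ijr} \left[ y^*_{ij} + \left( \bar{y}_{ir} - \bar{y}_{i0} \right) \right] \qquad (4)$$

where

[definition of $\bar{y}_{ir}$ illegible in source], $r = 0, 1, 2, \ldots, 76$ $\qquad (5)$

where $D_i$ is the ith donor cell,

$S_i$ is the ith recipient cell,

r is the replicate number, with replicate 0 the full
sample.


This formula can be obtained by applying
techniques similar to those used in Rao and Shao (1992,
p. 817) to half samples. The estimate for the variance
of the total expenditures, [illegible] uniform response
mechanisms, is

$$v_A = \sum_{r=1}^{76} \frac{\left( \hat{Y}_r - \hat{Y}_0 \right)^2}{76} \qquad (6)$$

The unadjusted estimator for replicate r can be written
as

$$\tilde{Y}_r = \sum_i \sum_{j \in D_i} w_{ijr}\, y_{ij} + \sum_i \sum_{j \in S_i} w_{ijr}\, y^*_{ij} \qquad (7)$$

and the unadjusted estimate for variance is

$$v_U = \sum_{r=1}^{76} \frac{\left( \tilde{Y}_r - \tilde{Y}_0 \right)^2}{76} \qquad (8)$$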


[Table residue: legible column headings are Var/10^[illegible], ABRR and SUDAAN; legible row labels
are Males 65+, Other Males 65+ and Other 18-45; no values are recoverable.]


As can be seen from the tables, the differences
between the adjusted and unadjusted BRR estimates of
variance are surprisingly small given the levels of
imputation and the implied increase in variances from
equation (1). The Taylor Series estimates are generally
[illegible] but as is demonstrated in Table B, sometimes
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THE COURT: Counsel. 

MR. BIERSTEKER: Thank you. Good morning*', Professor Rubin. 

THE CLERK: Excuse me, counsel. 

(Witness sworn.) 

THE CLERK: Please state your name and spell your last name. 

^THE\wiTKESS: My name is Donald Rubin, R-u-b-i-n. 
fcLERK: Thank you, please have a seat. 

ITALD B. RUBIN called as a witness, being first duly sworn, was examined 
pfied as follows: 

SIERSTEKER: 

Jave you ever testified in court before, professor? 
io, I have n^t. 

j,et' s start w iM % J some background information about you. Where do you 



live in [DELETED] 

Pre you iuarr Lel?J 

res. r“™S 

E)o you have 
res, I do. 

Q. How many chiS i do you have? 
fcggsL have two. 
fTMuld you re 
high^^ool.^N 

pifeMsince P$^vin«gj|Evanston High School, I went to Princeton University, wher* 
I got a baq&f&n#' s Sadje e, then I went to Harvard University, where I got a 
ma a&fcRilfc of science^atcna Ph.D. in statistics. 



nildren? 


for the jury, please, your educational background since 



here are y 
m current 
nt of stat 
Were you 

es. I was 
ave you ta 
es, I have 
ty of Minn 
or what pr 



rently employed? 

yed at Harvard University, where I'm a professor in 
chairman of the deoartment of statistics at 




ftotal of nine years in — in three three-year terms, 
statistics at universities other than Harvard? 
taught at Princeton University, I’ve taught at the 
I've taught at the University of Chicago. 
iOnal associations have you been elected a fellow? 

S here are several: the American Statistical Association; the Institute 
ematical Statistics; the American Association for the Advancement of 
Scie nce j the American Academy of Arts and Sciences, which is really only an 
honoSJrlLc society, so everybody is a fellow effectively; the International 
Stat|i^|j.cal Institute. 

— what is a fellow of professional organizations? 

A. A fellow is typically regarded as one of the more senior members of an 
association, and the way they typically are chosen is there's a committee of 
more senior members of the association who review nominations from the members 
and then select the people they think are — are worthy of that designation for 
their accomplishments in the field. 

Q. Very briefly I'd like to discuss a couple of the organizations you 
mentioned. One is the American Statistical Association. 

Copr. © West 1998 No Claim to Orig. U.S. Govt. Works
A. The American Statistical Association has about, I think, 20,000 members.

I think it's the largest statistical organization in the world. It is the 
statistical organization for the United States. 

Q. How about the — was it Institute of Mathematical Statistics? 

A. Right. 

It's a smaller organization that is international, although most members
are in North America, and it's more academically oriented than the American
Statistical Association, so it's -- it's primarily mathematical statisticians
at various universities.

Q. What about the American Association for the Advancement of Science,
what is that?

A. That's a very large organization. I don't know how many members it has
but it's probably a hundred thousand or more, and it includes all
areas of science: physics, work in environment, genetics, biology. Statistics
is a small piece of it.

Q. The jury has heard about reviewers for peer-reviewed publications. I
want to ask you something a little bit different. What is a manager of a
scientific publication?

A. The chief editor, the editor for a scientific publication, has the
ultimate responsibility for accepting or rejecting a submitted article based on
reviews that have been obtained from outside reviewers and the editor's judgment
of whether the reviews suggest that the article is worthy of publication.

Q. Have you been an editor for the Journal of the American Statistical
Association?

A. Yes. I was the editor for the applications section of the Journal of the
American Statistical Association.

Q. Did you also have editorial responsibilities beyond that for the Journal
of the American Statistical Association?

A. Yes. At other times I was the associate editor for the theory
section of the Journal of the American Statistical Association, as
well, and when I was the applications editor, I was also coordinating editor,
which means I coordinated both parts of the journal and represented it to the
[illegible] office.

Q. Have you edited or been an editor for other professional peer-reviewed
publications?

A. Have I been an associate editor for other -- other journals? Yes.
Biometrika. I've been associated for -- associate editor for Sinica, which is a
statistics journal that's -- primarily is published in -- in Asia. I'm an
associate editor for Psychiatrist Medicine, I believe. I'm also an associate
editor, or I have been, for Brain and Behavioral Sciences. And there are probably
one or two others. I don't --

Well, I think Journal of Education Statistics. There may be several
others as well.

Q. Beyond your editorial responsibilities, have you been entrusted with 
responsibilities by your peers in the American Statistical Association? 

A. Yes. At one point I was on the board of directors of the association. 
I’ve been on various committees for the association over the years. 

Q. Did you serve on the American Statistical Association's Census Blue 
Ribbon Panel?

A. Yes. That was a recent committee I served on.

Q. I've read in the paper news stories about undercounts of the poor and the
minorities in the census. Was that an issue that the committee upon which you
served addressed?

A. Yes, it was. The formation of this Blue Ribbon Panel was to provide
advice to the use of sampling in the decennial census as opposed to trying to do
a head count in the sense of saying to what extent was it possible that the use
of sampling would provide more accurate estimates of the head -- of the counts
in the United States, especially of -- of minority people who are typically
undercounted.

Q. Did you serve on the American Statistical Association's section of survey
research methods?

A. Yes. In fact now I'm chairman of the association's section on survey
research methods.

Q. What are survey research methods?

A. Survey research methods are the techniques, the methods that are used to
collect survey data, in particular survey data in large government surveys or
large professional surveys of the type that arose in this case, like NHANES and
NMES.

Q. Could you describe the members of the section on survey research in the
American Statistical Association?

A. Many of the members are federal employees who are involved in conducting
the surveys, for example, at the Census Bureau or at the Federal Reserve or
National Center for Health Statistics, those kinds of agencies, and many of them
are people at companies who actually conduct major surveys like
National Opinion Research Center, which is in Chicago, or Westat, which is in
the Washington area.



Q. Did you serve on one sort of committee for the American Statistical
Association that gave -- gives out an award for outstanding statistical
applications?

A. Yes. A few years ago I served on a committee -- I don't remember the
exact name, but there are about four or five committee members, and the
committee was charged with the responsibility of -- of choosing among submitted
applications that -- to be chosen of what was the most outstanding
application. It was a way of encouraging good applied work in the field of
statistics.

Q. Now you talked about statistical applications before when you discussed
your editorial responsibilities for the Journal of the American Statistical
Association and now again with regard to this committee that you serve on where
you gave an award. Could you provide the jury with an example of a statistical
application.

A. Well, certainly the plaintiffs' model in this case is a -- is a -- is a
major application. It's an application of statistics to try to address a
problem.

Q. What is the Samuel S. Wilks Memorial Medal of the American Statistical
Association? 

A. That's an award that the association gives to a member once — once a 
year for lifetime accomplishments. 

Q. Have you ever received that medal?

A. Yes, I did, several years ago.

Q. What is the National Academy of Sciences?

A. The National Academy of Sciences is an organization that was set up more
than a century ago, I believe, to provide advice on scientific issues to the
government.

Q. Have you served on any of the committees of the National Academy of
Sciences?

A. Yes, I have.

Q. Did you serve as the chairperson of the theoretical subpanel of the panel
on incomplete data?

A. Yes, I did.

Q. What is incomplete data?

A. Incomplete data refers to the problem of missing data. In other words, if
there were no missing data, the data set would be complete. And because of
missing data, the data set is often called an incomplete data set. So this was
dealing with that problem; the panel was set up to deal with issues of
incomplete data, primarily in surveys.

Q. Did the panel on incomplete data produce any published work?

A. Yes. There were three volumes that were published, there was a
theoretical volume, there was a case studies volume, and there was a proceedings
volume because the panel put together a conference, and the papers that were
submitted and read at the conference were compiled and put into a third volume.
Also there was a [illegible] that was put in one of the volumes, and I don't
remember exactly which -- which of the three, that was advice on dealing with
problems of incomplete data, primarily federal -- large federal surveys.

Q. Professor, have you published books and peer-reviewed articles in the
field of statistics?

A. Yes, I have.

Q. Approximately how many?

A. I have approximately, I don't know, 230 articles, and maybe half a dozen
books.

Q. Are there areas within the field of statistics on which you believe
you have special expertise?

A. I think I'm recognized especially for my contributions to problems of
missing data, non-response in surveys, and also to problems of causal inference
in experiments and, in particular, in observational studies.

Q. Let's turn to the last area first, causal -- causal issues. What is
causal inference in statistics?

A. Causal inference is -- is trying to assess the effect of an exposure
versus no exposure on some outcomes. So an example could be in -- in
experiments, couple of the examples concern drug development where, before drugs
are approved at FDA, for instance, they have to go through a randomized
experiment where people are divided in half randomly and some get the drug and 
some typically get a placebo, and then if — if it looks like the drug — the 
new drug is doing better than the control drug, then it's approved and it's — 
and it's marketed. If it doesn't look like it's doing better, then FDA turns it 
down. That would be an example of causal inference in a randomized experiment. 

The difference in -- in an observational study is that you don't get to do
the randomization, so you -- you don't know that the groups who have been
exposed and unexposed start out in a comparable way. They typically don't in an
observational study. But you're still trying to learn about the effect of 
exposure. 

Q. Are the data sets being employed in this case observational or 
experimental? 

A. They're observational. 

Q. Have you taught courses on the subject of statistics and causal 
inference? 

A. Yes, I have.

Q. This past year did you teach a graduate course at Harvard entitled
"Causal Inferences in Statistics: Social and Biomedical Sciences"?

A. Yes, I did.

Q. Did you teach courses on causal inferences when you were at Princeton
University?

A. Yes, I did.

Q. Did you have any students in your courses with whom the jury might be
familiar?

A. I think so. One student was Scott Zeger, who -- who took the course that
I taught at that time.

Q. Was this a [illegible] in statistics?

A. Yes, it was. In the department of statistics.

Q. Do you know what degree Professor Zeger has?

A. I'm almost sure his Ph.D. is in statistics in the -- from Princeton
University.

Q. Is there a body of statistical literature on statistics and causal
inference?

A. Yes, there is. It's quite substantial.

Q. How long has that literature been around?

A. Well, the formal literature on drawing causal inferences from experiments
goes back to the beginning of the century with the advent of the common use of
randomized experiments. The formal literature on observational studies and how
to do them well is probably more recent, but goes back 30 or 40 years.

Q. Have you written on this particular subject?

A. Yes, I have.

Q. For how long have you written in peer-reviewed journals on this subject?

A. About 25 or [illegible] years.

Q. Have you recently published articles on this subject?

A. Yes, I have.

Q. Have you written on the subject of something called propensity scores?

A. Yes, I have.

Q. What are propensity scores?

A. The idea of propensity scores is most applicable in observational studies
where an exposed and an unexposed group don't start out comparable with respect
to background factors as they would in the randomized experiment, and the idea
of propensity scores is a technique to help you compare exposed and unexposed
who are similar in the background. In other words, it's a technique, collection
of techniques perhaps, that's designed to -- to compare like with like, to
compare exposed and unexposed who are -- who are similar with respect to these
potential confounding variables.




*8 Q. Have propensity scoring techniques been used in addressing medical
issues in the literature? 

A. Yes, they have. 

Q. Could you provide the jury with some examples of the use of propensity 
scores as have appeared in the scientific literature to address medical issues?

A. Sure. One example from several years ago is a GAO -- that's Government
Accounting Office, a federal agency -- study of the treatment of breast cancer
in [illegible] women, comparing mastectomy versus breast conservation therapy.
And they wanted to do this from an observational database, not randomized
database, called SEER, which is Surveillance Epidemiology End Results, that
collects information on a variety of things, but also women who have breast
cancer, and the reason they wanted to do that in the observational database is
they wanted to compare the results from the observational database with the
results of very [illegible] randomized experiments that had been done on the
same treatments for breast cancer. The concern was that the type of doctors and
women who will go into randomized experiments for breast cancer treatment, to
have a toss of a coin decide what treatment you're going to get, are not typical
[illegible] in general practice, whereas the observational data set, the data
set of women and [illegible] who actually get breast cancer and get treated, is
representative of the kind of people who -- who are getting breast cancer and
have to make decisions about what treatment to take.

When we looked at the -- at the groups of -- of women who choose to -- with
their doctors choose to have mastectomy versus choose to have breast
conservation. There are just differences [illegible] believe, there are
[illegible] country, there are differences in -- in a variety of factors that
have to be controlled for if we're comparing like with like, women who chose
to -- with their doctors chose to have breast conservation with the women who
chose to have, with their doctors, mastectomy. So there was a need to compare
like with like, and the Government Accounting Office used propensity score
techniques to do the adjustment for the collection of background variables.

Q. Have you used propensity scoring techniques in the published literature
in a, I guess, article in the Journal of the American Medical Association?

A. Yes, I did. I'm still involved in a long-term study on the effects of
prenatal exposure to hormones and barbiturates. And there was a study that we
did using a Danish cohort on barbiturate exposure in utero, and that work was
published by me with several other authors in the Journal of the American
Medical Association maybe one or two years ago.

Q. Are propensity scoring techniques used only in medicine, examining
medical issues?

A. No. They're -- they're used in other fields as well.

Q. Is there an example from — one example from another — another field 
that you can offer to the jury? 

*9 A. In economics it’s -- it's been used to compare the — to assess the 
effects of job training programs where you have people who volunteer for job 
training programs and some people don't volunteer, and you'd like to know the 
effect. And this is a job training program, so your exposure is are you job
training or not job training. It's been used in education to try to compare
different educational programs and see which are most effective. 

Q. You mentioned, doctor, something about comparing like to like. Do 
statisticians try to compare like to like? 

A. When they're doing causal inference, when they're trying to assess the
effect of an exposure, where they're trying to say what the effect of this
exposure is, it's essential that you try to isolate the effect of that exposure,
and the way you do that is you compare people who are exposed and unexposed who
are similar with respect to this background characteristics -- these background
characteristics. So it's essential in that context to compare like with like,
the exposed and unexposed who are similar in background characteristics.

Q. What happens when you compare two groups, such as smokers and
non-smokers, but you don't compare like to like?

A. Well then when you look at an estimate of a -- of an effect between the
exposed/unexposed group, you don't know whether that's due to the exposure or
due to the thing that differs in the two groups, because they're not alike.

Q. Do propensity scores help you to determine whether two groups, such as
smokers and non-smokers, are different with respect to a collection of factors;
for example, in this case, seatbelt use and income, that you might want to take
into account when doing a comparison?

A. Yes. It's an effective way of assessing in a quantitative way how far
apart the exposed and unexposed groups are with respect to this collection of
background variables.

Q. [illegible] you to briefly summarize -- I'll go over with you some of
[illegible] at here in this case.

Did you review the plaintiffs' models and their reports and their computer
[illegible]?

A. Yes, I did.

Q. Did you review the testimony of Drs. Zeger, Wyant and Miller in their
depositions?

A. Yes, I did.

Q. Did you review the trial testimony of Drs. Zeger and Wyant?

A. Yes, I did.

Q. Based upon your review of what the plaintiffs did, was there any
indication that the plaintiffs analyzed their data to see how much smokers and
non-smokers differ with respect to the collection of factors such as seatbelt
use and income that they've put into their model?

A. I saw no evidence of such a comparison.

Q. Did you compare smokers and non-smokers in the plaintiffs' data set
yourself to see how they differ with respect to the collection of factors that
the plaintiffs decided to use?

A. Yes, I did.

Q. How did you conduct that investigation to see whether or not the smokers
and non-smokers differed with respect to those characteristics?

*10 A. I used the propensity score method in each of the six age by sex 
cells that the plaintiffs used, so I looked within each of those cells to see
how far apart the smokers and non-smokers were with respect to this collection 
of background variables that the plaintiffs' models tried to adjust for. 

Q. What data set did you use to do that analysis?

A. This was NMES, the National Medical Examination Survey.

Q. What were the actual measurements that you made in the NMES data set when
you looked at smokers and non-smokers?

A. I'm going to use a little bit of technical jargon, not because I -- I
think it will convey much, but in order to answer that question I just have to
say -- use some technical jargon. I -- I examined the number of standard
deviations between the means of smokers and non-smokers on this thing called the
propensity score; I compared the ratio of -- of variances of the propensity
score ratio between smokers and non-smokers; and I also compared the ratio of
variances of the original background variables after adjusting for the
propensity score between smokers and non-smokers.
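
As a rough illustration of the three measurements named in this answer, the following Python sketch computes a standardized difference in means, a ratio of variances, and residuals of a background variable after a straight-line adjustment for the propensity score. It is not taken from the testimony or from any analysis in the case. The function names, the pooled-variance standardization, the straight-line adjustment, and the sample numbers are all assumptions made for illustration, and the propensity scores are treated as already estimated.

```python
from statistics import mean, variance


def standardized_difference(exposed, unexposed):
    """Difference in group means, in units of the pooled standard deviation."""
    pooled_sd = ((variance(exposed) + variance(unexposed)) / 2) ** 0.5
    return (mean(exposed) - mean(unexposed)) / pooled_sd


def variance_ratio(exposed, unexposed):
    """Ratio of sample variances, exposed over unexposed."""
    return variance(exposed) / variance(unexposed)


def residuals_after_adjustment(covariate, score):
    """Residuals of a covariate after a least-squares straight-line fit on the score."""
    score_mean, cov_mean = mean(score), mean(covariate)
    sxx = sum((s - score_mean) ** 2 for s in score)
    sxy = sum((s - score_mean) * (c - cov_mean) for s, c in zip(score, covariate))
    slope = sxy / sxx
    return [c - (cov_mean + slope * (s - score_mean)) for s, c in zip(score, covariate)]


# Hypothetical propensity scores (made-up numbers, not NMES data).
smokers = [0.55, 0.60, 0.70, 0.80, 0.85]
nonsmokers = [0.20, 0.30, 0.35, 0.45, 0.50]

print(standardized_difference(smokers, nonsmokers))
print(variance_ratio(smokers, nonsmokers))
```

For the third measurement, one would compute the residuals on the pooled data, split them back into the two groups, and pass each group's residuals to `variance_ratio`.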

Q. Why, of all the measurements one could make, did you happen to choose
these three measurements?

A. I chose those three ways of measuring the differences between the smokers
and non-smokers with respect to this collection of background variables because
there's a statistical literature that's -- that uses those kinds of measurements
to indicate how much trouble you might be in -- how far apart these groups are,
and therefore, how much trouble you might be in using methods -- different
methods to adjust for those differences. So it's a way of calibrating back to
the already-established statistical literature which provides guidance on how to
do such adjustments in observational studies.

Q. You mentioned the statistical literature providing guidance. You yourself
wrote articles on this many years ago with William Cochran; is that right?

A. That's correct.

Q. Who was William Cochran?

A. William Cochran was my thesis advisor at Harvard, my Ph.D. thesis adviser
at Harvard. And he was a very distinguished professor who also served as one of
the authors of the original Surgeon General's report in 1964.

Q. Now, doctor, what did you find when you made these measurements from the data
of the plaintiffs and compared it to the standards that exist in the
peer-reviewed literature?

A. Well I found that the smokers and non-smokers were substantially
different with respect to this collection of background variables that the
plaintiffs tried to adjust for in their models.

Q. And what is the consequence when the smokers and non-smokers in the
plaintiffs' data sets differed substantially with respect to the factors that
they include in their models?

A. The kinds of models that the plaintiffs used, which are linear models or
log linear models, logistic models, they have assumptions, and what this
literature shows is that those kinds of models just cannot be trusted to provide
reliable adjustments. It just doesn't -- doesn't work in general. They're not
reliable.

Q. And why is that?

A. It's because the groups, when they're far apart, don't overlap very much, 
and all these models have assumptions built into them that involve extrapolating 
straight lines, or curves that get broken down into straight lines, into regions 
where the data are very thin in the other group, and the literature shows that 
those kinds of models cannot be trusted to do the adjustment for the collection 
of background variables that are supposedly in the model. But being in the model
doesn't mean they're being adjusted for.

Q. Professor, do you have an opinion about whether or not the plaintiffs'
model reliably compares like to like even with respect to the factors that they
included in the models?

A. Yes, I have an opinion on that.

Q. And what is that opinion?

A. That it does not.

Q. Was public aid status one of the factors that plaintiffs included in
their models?

A. Yes, it was.

Q. Does plaintiffs' model reliably adjust for public aid status?

A. No, it does not.

Q. Is there a reliable way to adjust for public aid status in the data
sets that the plaintiffs used?

A. I believe there is.

Q. And what is that -- what is that analysis?

A. Well I --

MR. HAMLIN: Objection, Your Honor, I believe this goes beyond the scope of
[illegible]'s report. I don't believe that he specifically talked about any
techniques that were supposed to be used in the model. All the expert report did
was criticize.

MR. BIERSTEKER: I'm not sure that's right, Your Honor. This was addressed
not in his report but also in the depositions, the very techniques we've
been talking about this morning.

THE COURT: [illegible]. I'll allow the answer.

A. It would have been much better, especially given the importance of that
-- variable, to have done a separate analysis for the public aid and the
private sector. Hopefully those -- within those two groups, the smokers and
non-smokers would not have been as dissimilar with respect to the collection of
background variables the plaintiffs were trying to adjust for as they were.

Q. Does plaintiffs' model compare like to like for factors such as exercise
[illegible]?

A. No. It cannot. Since they're not in the model, it can't even pretend to
adjust for them.

Q. Professor, I'd like to turn to the second area in statistics for which
you have special expertise, and that's the problem of missing data. For how long
has that been a subject of interest to you?

A. Twenty-five, 30 years.

Q. And we touched on this briefly before, but maybe you could elaborate a
bit further. What is missing data?

A. Missing data, as it's usually used, refers to the fact that when you try
to do data collection, a survey or an experiment, it's not uncommon to not be
able to collect some values that you wanted to collect. For example, in a large 
survey you often find what's called unit non-responses. Some people who you send 
a questionnaire to or ask to respond don't respond; they don't send the 
questionnaire back. That’s called unit — a person is a unit — it's called unit 
non-response. 

*12 And the second major kind of missing data is item non-response, which
refers to someone who sends a questionnaire back in but leaves certain items
out; for example, income items or health expenditure items. Although the person 
fills in other things, gives perhaps sex and education, but doesn't want to tell 
you income. 

Q. Is there a body of literature in the professional field of statistics
that discusses how to address these kinds of missing data problems?

A. Yes, there is.

Q. Is that an extensive literature?

A. Yes, it's -- it's quite extensive, especially in the last 20 years.

Q. Apart from the three volumes that you prepared along with others in your
[illegible] the National Academy of Sciences, are there any leading books or
[illegible] on the problem of missing data?

A. I think that the standard reference for missing data problems is still
the book that I wrote with Rod Little, it's called -- in 1987, published in
[illegible], called "Statistical Analysis With Missing Data." There's another
book that I wrote that was published in the same year on a technique that I
developed called multiple imputation, which I think is also a fairly standard
reference.

Q. Professor, the last book you mentioned had "imputation" in the title.
What is imputation?

A. Actually the title of the book, I think, was "Multiple Imputation
for Non-response in Sample Surveys" -- or maybe in "Surveys" without the
"Sample." But the imputation refers to -- it's just a technical word for filling
in. If we have a blank somewhere, you say, gee, I really wish I had that, I
wish the gap -- [illegible] that value of income, you take something somewhere
and fill in, and that's called imputation. You impute a value where there's
nothing there.

Q. Have you consulted with the federal government on issues about missing
data of surveys?

A. Yes, I have.

Q. Can you provide the jury with a few examples.

A. Sure. For several years now, I don't know, eight maybe, I've worked with
the National Center for Health Statistics on their missing data problems in
NHANES, National Health And Nutritional Evaluation Survey. I've worked with the
Department of Transportation for the National Highway Traffic Safety
Administration on [illegible] data set, which is a Fatal Accident Reporting
System. I've worked with the Census Bureau over many years with problems of
missing data in various of their public use files and still -- still doing
things with the Census Bureau. I've done things that -- in the past with
Internal Revenue Service and some of their research databases. What else? There
are other agencies as well, but -- if I thought more, I'd probably be able to
pick them up. The Bureau of Labor Statistics on survey -- consumer expenditure
survey.

I've worked in the past for the Federal Reserve on the survey of consumer 
finances. More may come to mind if I sat for a while.

Q. Professor, did — 

You mentioned NHANES. Is that the same NHANES survey that the jury has heard 
about in this case? 

*13 A. Yes, it is. Although the version that I spent my time on is the 
current version, which is NHANES III, and I believe the version that the jury
has heard about is NHANES I, which is an older version of it. It's -- the same
group of people conducted the survey. The survey is essentially the same survey,
it is all essentially the same. 

Q. Do any of those surveys that you consulted on with the federal government
use multiple imputation to fill in missing values?

A. Yes, they do. In fact NHANES, the new database for NHANES that's going --
will be released within a few months is multiply imputed. In fact the current
version of NHANES, they just leave the missing data blanks because they've
[illegible] that the old way of doing it was inadequate and unreliable, and so
the version that, I understand at least, is the -- of NHANES that's going to be
available, either will be with nothing done to the missing data, just leave them
blank, or being multiply imputed in a -- in a project that I helped design.

Similarly I understand -- I think the Fatal Accident Reporting System may be
released with multiple imputations for blood-alcohol content. It's been done and
I think it's being considered because it's considered an improvement over the
current ways of doing it. There is various --

There's a couple of Census Bureau public use files that use multiple
imputation being done eight years ago. I understand that the survey of consumer
finances uses multiple imputation for its imputations of missing values.

Q. Professor, you talked about multiple imputation, and we don't need a
technical response. But what -- what's the gist of it?

A. Well the essential idea is that if you have a missing value so you don't
[illegible], and you put in one number and then act as if that number
[illegible], that can't be a reliable thing to do because you know it's not
right. You don't know what the number is. If you knew what it was, it wouldn't
be missing. So the essential idea of multiple imputation is actually a very
intuitive one, which is instead of filling in one value, you fill in two or more
to reveal the uncertainty about what the right value to fill in is, and the
trick is that by filling in two or more you can actually get reliable answers
out the end that reflect the fact that you didn't know what that number was.
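
To make the gist concrete, the sketch below shows the commonly cited combining rules for multiple imputation: analyze each filled-in data set separately, average the estimates, and add the between-imputation spread to the average within-imputation variance so the final uncertainty reflects the values that were never observed. It is an illustration only, not something read into the record. The function name and the sample numbers are hypothetical, and each completed data set is assumed to have already produced one estimate and one variance.

```python
from statistics import mean, variance


def combine_imputations(estimates, variances):
    """Pool per-imputation estimates and variances into one estimate and total variance."""
    m = len(estimates)
    pooled_estimate = mean(estimates)        # average of the m estimates
    within = mean(variances)                 # average within-imputation variance
    between = variance(estimates)            # spread of estimates across imputations
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical results from three completed data sets (made-up numbers).
estimates = [10.0, 12.0, 11.0]
variances = [1.0, 1.2, 0.8]
print(combine_imputations(estimates, variances))
```

Filling in a single value and treating it as real corresponds to dropping the between-imputation term, which is why the resulting intervals come out too short.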

Q. If missing data are imputed without using procedures that have been
developed in the statistical literature, including your normal imputation
technique, what are the consequences for the data themselves?

A. Well the data values themselves can't represent what the -- what the real
data are because you're filling in values, one value, and also the standard --
not "standard" but -- strike "standard," I'm sorry -- inappropriate methods will
create imputations that are not accurate, they're -- they're unreliable.
Moreover, the use of the data set will not yield reliable answers. The point
estimates, the estimates of -- of the quantities that come out from the analysis
will be wrong, and always -- "wrong," wrong in general, it will not be reliable
-- and the confidence intervals, the plus or minus that you put on estimates
will be too short because you're pretending as if that value that you put
in was true when in fact you know it's not true, there -- there's uncertainty
there.

*14 Q. Professor, did the plaintiffs confront a problem with missing data in
the data sets that they used in this case? 

A. Yes, they did. 

Q. Now earlier you talked about two kinds of missing data problems, one 
where somebody just doesn't answer at all, and one where a person maybe skips a
couple of questions. What kind of missing data problems did the plaintiffs
confront in the data sets that they used here? 

A. They confronted both, both kinds. 

Q. Did you investigate the missing data problems in the data that the 
plaintiffs used and also how those problems were addressed? 
A. Yes, I did.

Q. We have a rather large demonstrative, but let me ask a few foundation
questions. And I think you'll find this, professor -- there's a book up there,
tab 32, and it's Exhibit No. X3085A as in apple.
Did you prepare this exhibit, professor?

A. I did, although the colorful display was done not by me.

Q. Is Exhibit X3085A a summary of the missing data and the various data sets
in the plaintiffs' model in this case?

A. Yes, I believe it is.

Q. Is it an accurate summary?

A. ... certainly ... so.

Q. Will it assist the jury in understanding your testimony?

A. Yes, I believe it will.

MR. BIERSTEKER: Your Honor, I would offer Exhibit X3085A into evidence as
both an illustrative exhibit and as a summary under Rule 1006.

MR. HAMLIN: Your Honor, we would agree to the admission as an illustrative
exhibit. We don't agree that it is a summary of the data sets. It certainly is
not complete or accurate.

THE COURT: Well at this time I'll allow it as an illustrative, X3085A.

MR. BIERSTEKER: It is so big, Your Honor, I'm not sure how to do this.

THE COURT: You need a smaller exhibit or a bigger stand, I guess.

MR. BIERSTEKER: If the exhibit gets smaller, no one can see it.

MR. BIERSTEKER: I think that will be all right. Is that okay?

MR. HAMLIN: Your Honor, may I --

MR. BIERSTEKER: Maybe if the professor could come down, and then maybe, Mr.
Hamlin, could you go over there. Would that be all right, Your Honor?

THE COURT: Sure.

MR. BIERSTEKER: Thank you.

Your Honor, could you make sure that the jury can -- I think they can see
-- that's -- Okay.

I'm sorry. Professor, this is a --


THE COURT: Excuse me, counsel. Just a word of caution, professor, if you're
going to use this for illustrative purposes, keep in mind that the reporter
has to take down your conversation, everything you say, so if you start
turning your back on the reporter, sometimes it's difficult to take that down.

THE WITNESS: I understand.

THE COURT: Okay? Thank you.

THE WITNESS: Thank you, Your Honor.

BY MR. BIERSTEKER: 

Q. Professor, this is a pretty busy exhibit with lots of colors and lots of 
other things, and so I'd like you to, pretty slowly, explain what it is. 

First, what are all these different rows in the exhibit? 

A. Each row corresponds to a data set that was used by the plaintiffs for
some purpose. So, for example, the first row corresponds to the NMES, the
national data -- medical examination database, complete with the supplement. The
second row is that database without the supplemental survey. The next data set
is BRFSS, the Minnesota telephone survey. The next set of -- the next row
corresponds to the billing enrollment, the Medicaid data for Minnesota. Then the
Blue Cross Blue Shield data for Minnesota. And then the bottom row corresponds
to the NHANES data set.

Q. Okay. So these are the data sets going down. What are all the items
in the different columns of the exhibit?

A. Each column refers to a collection of -- of factors or variables that
was somewhere in some of the plaintiffs' models. So, for example, the first set
of columns refer to age and gender, the second column refers to race,
smoking status. And then at the far end, for example, there's self-reported
health status.

Q. What do all -- what are all the blue boxes? What does that mean?

A. The blue boxes correspond to data that is entirely missing, so it --
when there's a blue box, it means that that survey did not collect any
information on that variable.

Q. And what about the green boxes, what are they?

A. The green boxes refer to places where data may have been available but
were not included in the plaintiffs' models.

Q. And what are the white squares, what are they?

A. The squares indicate where -- where data were available, and the
percentages in those boxes, when there are percentages in the boxes, indicate
the extent of missing data that are there. So where there are no percentages,
they are basically complete, there's no missing data.

So, for example, if you look at age and gender, in all the databases
basically everybody has age and gender reported. And in -- in almost all the
databases race -- race is always reported. Then as you get to some of the other
variables, you can see that they're not available in most of the databases,
-- they're -- they're either entirely missing or partially missing.

Q. Let's -- let's talk about some variables that are used in the model,
let's first talk about medical expenditure data in the plaintiffs' national data
set, the NMES data set.

A. So the NMES data set with supplement is the first row, and the NMES data
set without the supplement is the second row, and the total medical expenditure
information is this column, and this indicates that in the NMES with supplement,
that 53 percent of the values of medical expenditure, 53 percent of those values
in -- in that survey were missing. And in the NMES without the supplement,
54 percent of the values were missing.

Q. Who filled in the values for expenditures in the National Medical 
Expenditure Survey when they were missing over half the time? 

A. That was done by the agency that produced NMES. 

Q. Do you know whether or not there was expenditure information missing from 
the BRFSS, the Minnesota telephone survey, that the plaintiffs used? 

A. Well if you look at BRFSS, the medical expenditure information is -- is
all blue.

Q. And that's -- that means it's missing?

A. It's — it's all missing. Everything’s missing. 

Q. Let's talk about another example, let's talk about smoking. 

A. Okay. 

Q. What was the status of the data in the NMES data set with respect to
smoking?

A. Well in the --

For those people who took NMES without the supplement, there's no smoking
information, so that's why that's entirely blue, and in the -- for those people
who took NMES with the supplement, it's either five percent or nine percent
missing, depending upon how "smoking status" is defined.

Q. What do you mean, how smoking status was defined?

A. Well if you look at footnote A, we could see exactly how -- what the
definition is. And these footnotes refer to how the plaintiffs took care of
missing data in the various places, and because it's such a collection of
differing techniques, there's a collection of footnotes.

Q. Okay. There are some of the footnotes (handing the witness a piece of
paper).

A. So in this -- it refers to -- it's five percent when smoking status is
coded as ever smoker versus never smoker, and it's nine percent when smoking
status is coded as current, former or never smoker. And it says further in this
--

(Discussion off the record.)

Q. Now the people who didn't respond to the supplement in NMES, how many of
those were there?

A. Well this shows about 2,240, which is about 10 percent of the -- of the
sample that -- in the original NMES.

Q. And is that in addition to the percentage missing in the -- among the
people who did respond?

A. That's correct. So these are all missing, and these -- this five or nine
refers to the NMES with the supplement.

Q. ... claims data for Blue Cross Blue Shield and Minnesota?

A. Well we'd have to look to see. In here is the claims data, this is for
Medicaid and this is for Blue Cross Blue Shield, and the blue meaning that
smoking status was not available.

Q. All right. Let's talk about insurance status for a moment. Was insurance
information missing for some of the people in plaintiffs' national data set?

A. I'll block somebody no matter where I stand, so --

Here's health insurance, and indicates it's for private and public. In NMES
with the supplement it's missing about a quarter of the time, and in the piece
that's without the supplement it's missing about half the time. And BRFSS, BRFSS
it's missing -- private indicator is missing 30 percent of the time and public
indicator is missing 90 percent of the time. So these are the -- the -- the
reports that come from the survey.

Q. Does it --

Does it make any difference, for example, that the plaintiffs' model is
missing smoking information for the claims data, but they have it someplace else
in one of their other data sets?

A. Yes, it certainly does make a difference because ideally this -- here's
the -- here's the data in Minnesota, the -- the billing enrollment Medicaid data
and the Blue Cross Blue Shield data, and ideally you'd like to have all this
filled in perfectly, like to have all this white, which is with no -- with no
missing data. Then you can -- you could do the kind of calculations and
adjustments that the plaintiffs are trying to do without having to try to piece
together all kinds of information from all over the place.

Q. If plaintiffs had all white boxes in the claims data, could they have
done what they did here without NMES and without the NMES -- without the --
supplement and without BRFSS?

A. Yes. I mean the purpose of these other data sets and the analyses on
other data is to compensate for the lack of information in these
databases, the lack of gathering information to fill in these blue
boxes in these databases.

Q. Now we mentioned the medical expenditure information here that was filled in
by the government agency. For these other categories, the missing smoking and
other categories that appear on this table, the insurance, marital status,
..., seatbelt, overweight, self-reported health status, who filled in
those values?

A. The plaintiffs.

THE WITNESS: I ...

MR. BIERSTEKER: ... we can put this down and the professor can return to
the stand, ... to leave it in front here.

THE COURT: Why don't we take a short recess at this time then.

THE CLERK: Court stands in recess.

(Recess taken.)

THE CLERK: All rise. Court is again in session.

(Jury enters the courtroom.)

THE CLERK: Please be seated.

THE COURT: Counsel.

MR. BIERSTEKER: Thank you, Your Honor.

BY MR. BIERSTEKER:

Q. Professor, one other thing on this chart I didn't mean to -- to leave
out, that is each time there's a percentage and sometimes even when there
isn't, there's also a letter. Do those refer to footnotes?

What do the footnotes reflect?

A. That's right. The footnotes reflect the different ways the missing values
were imputed.

Q. Do you have an opinion about whether --

Well, why are there so many?

A. Well the reason there are so many is there are all kinds of different ad
hoc means used to impute the different variables in different surveys, so it's a
hodgepodge of different things, and so there -- there are many, many footnotes
because there's a whole collection of different things they're doing in an
unprincipled way.

Q. Do you have an opinion about whether the method -- I guess methods used
to fill in the missing or unknown values in the plaintiffs' data sets were valid
or invalid?

A. They're not valid.

Q. What are the consequences because the methods used to fill in those
missing values were invalid?

A. Well first, clearly, the values themselves are not correct. Secondly, the
point estimates, the estimates of any -- that come out of any analysis, the
basic number is not -- cannot be trusted. And moreover, the plus and
minus, the confidence interval, the relative error that's attached to an
estimate is too small, so the certainty with which something is believed is
falsely precise.

Q. So I want you to --

Well professor, you reviewed the testimony of Dr. Wyant?

A. Yes, I did.

Q. And did he discuss sampling error, the plus or minus on the plaintiffs'
different estimates?

A. Yes, he did.

Q. Okay. I want you to assume, professor, that Dr. Wyant testified that the
error, the plus or minus on the plaintiffs' estimate for major tobacco-related
diseases, like lung cancer or the CHD/stroke, was approximately 82 percent of his
estimate at the 95 percent confidence level. What effect does the invalid way in
which the plaintiffs filled in the missing values in their data set have on the
true plus or minus on the plaintiffs' estimates?

MR. HAMLIN: Objection, Your Honor, I believe that misstates Dr. Wyant's
testimony. He did not give a 95 percent confidence interval.

MR. BIERSTEKER: Your Honor, I'd be happy to refer to the page. If you'd turn
to --

May I read it, Your Honor?

THE COURT: Can you give us the page first?

MR. BIERSTEKER: Yes. It's page 5651, starting at line 19 and continuing
through line 25.

The question --

This is from Dr. Wyant's trial testimony on February 27th.

Question: What I'd like to do is, if we use the 95 percent confidence
level, the plus or minus on this 558-million-dollar estimate for major smoking-
attributable disease expenditures would be approximately twice as great. Instead
of ... percent, it would be 82 percent; correct?

Answer: That is correct.

MR. HAMLIN: Your Honor, that came on cross-examination, and that is an
accurate statement of his testimony.

THE COURT: Go ahead.

BY MR. BIERSTEKER:

Q. Assuming that at the 95 percent level the plaintiffs have given testimony
that the plus or minus on their estimate of health-care expenditures
attributable to smoking is 82 percent of that estimate, what effect does the
plaintiffs' failure -- well, what effect does the failure to use valid methods
to fill in the missing data have on the plus or minus on plaintiffs' dollar
estimate?

A. Well the plus or minus has to be substantially too small. It has to be --
has to be larger just as the sampling error has to be larger, because there is
less real information in the data than represented by the filled-in data. The
filled-in values aren't real numbers. The expenditures,
values are filled in, they're not real.

Q. So the 82 percent estimate of the plus or minus on the dollar estimate is
-- actually understates the true plus or minus?

A. Absolutely. If -- if you put the plus or minus dollars on it reflecting
the uncertainty that's attached to those missing values, it has to go up because
you're acting as if the sample size for that information on medical expenditures
is twice as big as it really is. You have half the data there.

Q. All right, professor, I want you to assume that Dr. Wyant testified that
the plus or minus, the error for plaintiffs' diminished health status pure model
was about 84 percent of their estimate at the 95 percent confidence level. What
is the effect of the invalid filling in of missing information in plaintiffs'
data on the plus or minus of the dollar estimate in that part of their
model?

A. Essentially the same answer. They filled in values and treated them as if
they were real without accounting for the uncertainty, so therefore the plus or
minus that they have is too small.

Q. And the real plus or minus --

A. It would be larger.

Q. -- is actually larger?

A. The actual plus or minus has to be larger.

Q. I want you to assume that Dr. Wyant testified that the error on
plaintiffs' -- their diminished health status mixed estimate was
308 percent of their estimate at the 95 percent confidence level.
What is the effect of the invalid filling in of the data on the plus -- the
dollar plus or minus on that estimate?

A. Again it has to be even larger. The plus or minus that they report --
was what, 306 percent?

Q. Three hundred eight percent.

A. As applied to the dollars, that is -- is less than it should be,
perhaps substantially less than it should be.

Q. Let's talk about plaintiffs' nursing home estimate. I want you to assume,
professor, that Dr. Wyant testified that the error for nursing homes, the plus
or minus there was 352 percent of their estimate at the 95 percent level,
the confidence level. Does the -- does the failure to validly impute data affect
that estimate as well?

A. Yes, in the same way, that the plus and minus is too small. The real plus
or minus has to be larger.

Q. Do you know if there are additional sources of uncertainty in plaintiffs'
model due to sampling that the calculations of the plus or minus discussed in
Dr. Wyant's testimony don't reflect?

A. Yes, I believe so. I -- I believe that the fact that the BRFSS, the
Minnesota telephone survey, is a sample and not a census was not taken into
account in their calculations of plus and minus values, the relative errors.

Q. What would happen to the calculation of that plus or minus if the
uncertainty, due to their use of the BRFSS survey -- never mind the filling in
of the missing values -- the use of the BRFSS survey, were taken into account?

A. Well if they reflected properly the uncertainty, the plus or minus would
increase, again the relative errors would get larger, the plus and minus values
would -- would get bigger.

Q. Professor, we — we talked about sampling error. Did you investigate 
other potential sources of uncertainty or error in the plaintiffs’ damage 
estimates beyond sampling error? 

A. Yes. There are non-sampling errors. 

Q. And what is non-sampling error?

A. Non-sampling error refers to bias or potential bias in the estimates.
It's not the -- not the kind of uncertainty that goes away as the sample size
becomes bigger because it's just a bias, you're getting a wrong answer, you're
estimating the wrong thing. And the kinds of imputations, invalid imputations
that they did is one source of bias, and the -- and the other kind of bias that we --
we talked about earlier, referred to earlier, was the bias that comes from using
inappropriate models, inappropriate analyses to do adjustments for these
background factors.

Q. What do statisticians do when they realize they have a biased model?

A. When they have a biased model, they should try to fix the model, they
should try to either modify the model or choose a model or mode of analysis
that's more robust, "robust" using it in the technical way, which means it's not
-- the analysis is not sensitive to minor deviations from the model. And the
kinds of models that the plaintiffs used are -- are quite sensitive to deviations,
especially in this context as revealed by those propensity score analyses that
we talked about earlier. So statisticians try to fix the model. Or sometimes get
better data; that's another way to go.

Q. How do statisticians investigate whether a model is biased?

A. Well, there are a variety of ways of -- of -- of doing it. You can look at
sensitivity analyses where you change model specifications to see how answers
change, bring in variables, bring out variables. You can put variables in in
different ways. You can look in subgroups to see whether the predictions from a
model make sense and predict the values that -- that you're trying to predict.
There are a variety of model diagnostics that statisticians have developed over
the years.


Q. Professor, would Dr. Wecker's addition of exercise to the plaintiffs'
CHD/stroke model be an example of a sensitivity analysis?

A. Yes. That's an example of a sensitivity analysis.

Q. And -- and you mentioned looking at subgroups. What -- what are the
subgroups in the plaintiffs' model?

A. Well, for example, you could look at the answers within the plaintiffs'
age by sex cells, within those six age by sex cells, and see whether the answers
make sense as you look across the cells. You could look into finer breakdowns,
if you wanted to, to see whether the model's predictions are consistent with the
data and you would expect to see.

Q. In general, professor, have you performed these kinds of analyses in your
own applied work?

A. In my own applied work? Absolutely. They're obligatory. You can't have
faith in a model unless you've looked at what the consequences assumed in that
model are.

Q. What is the law of averages? 

A. The law of averages is a -- is a name for something that's usually called
the law of large numbers, in which -- and what it refers to is the fact that if
you -- in a situation where you have identical occurrences of something, like
coin tosses, the more coin tosses you have, the more stable the average will be.
So the -- as the number of events gets larger, the average looks more and more
like the true value. And so --

You have to have the underlying thing that's happening, though, being --
what's called identically distributed. They have to come from the same
distribution, like a coin toss.

Q. Well professor, did Dr. Wyant discuss the law of large numbers in his --
law of averages in his testimony?

A. Yes, he did.

Q. I want you to assume that Dr. Wyant testified that the law of averages or
large numbers makes it inappropriate to look at the different age and gender
groups, for example, 19- to 34-year-old males in the plaintiffs' model.
Can you assume that?

Do you have an opinion about whether the law of averages makes it
inappropriate to make those kinds of comparisons, for example, for 19- to 34-year-
old males in the plaintiffs' model?

A. Yes, I have an opinion.

Q. What is your opinion?

A. It doesn't have anything to say about whether it's
appropriate to look at those subgroups. It's a misstatement of the -- of the law
of large numbers, the law of averages.

Q. Why doesn't the law of large numbers apply to plaintiffs' estimate for
the different age and gender groups in their model?

A. Well the first reason is that, fundamentally, as they mentioned, this
law applies to identically distributed events where they're like coin tosses or
randomly chosen people from a -- from a population. These four of six cells are
not like that at all, so the law just does not apply.

The second reason is another kind of obvious one in that four of six are not
exactly large numbers. The law of large numbers applies as -- as the sample size
gets big, bigger and bigger, and four of the six aren't big, even if they were
identical, which they're not.

Q. Professor, based on your examination of the plaintiffs' model in this
case, did it estimate the effect on plaintiffs' health-care expenditures of
defendants' alleged misconduct?

A. No, it did not.

MR. HAMLIN: Objection, Your Honor, I believe that that is violative of the
order.

THE COURT: The objection is sustained. The answer will be stricken.

BY MR. BIERSTEKER:

Q. Professor, do you supervise graduate students' doctoral dissertations in
statistics at Harvard?

A. Yes, I do.

Q. Do you supervise graduate students' doctoral dissertations in other
departments within the university?

A. Yes. I help supervise some -- some others in other departments.

Q. What do you attempt to do when you supervise a graduate student who's
writing a Ph.D. thesis?

A. Well in statistics I -- I --

First of all, in any field, I try to make sure that the application or use
of statistics is -- is valid. If it's a dissertation in statistics, then there
often should be some component of original statistical thinking taking place. If
it's a thesis in another area, such as economics or health-care policy, I try to
assure that the use of statistics is -- is valid.

Q. Does the plaintiffs' model in this case use valid statistical techniques
to reach their estimates?

A. No, it does not.

MR. BIERSTEKER: I have nothing further at this time, Your Honor.


MR. HAMLIN: Your Honor, we need a few minutes to set up our cross.

THE COURT: Do you want to take a short recess?

MR. HAMLIN: Yes. Could we take five minutes?

THE COURT: All right. We'll take a five- to seven-minute recess.

(Laughter.)

THE CLERK: Court stands in recess.

(Recess taken.)

THE CLERK: All rise. Court is again in session.

(Jury enters the courtroom.)

THE CLERK: Please be seated.

THE COURT: Counsel.

MR. HAMLIN: Thank you, Your Honor.

Good morning.

(Collective "Good morning.")

THE WITNESS: Good morning.

Q. Good morning, Professor Rubin.

A. Good morning.

Q. My name is Tom Hamlin. I'm one of the attorneys for the state of
Minnesota and Blue Cross Blue Shield of Minnesota.

We've met before; right?

A. Yes, we have.

Q. Sir, you have worked as a consultant for the tobacco companies in other
litigation cases; correct?

Q. You have worked as a consultant in the Medicaid suit brought by the state
of ...; is that right?

*22 A. Yes, I did.

Q. And you have worked as a consultant for the tobacco companies in the
Medicaid suit brought by the state of Mississippi; correct?

A. Correct.

Q. You've also worked as a consultant in the Medicaid suit brought by the 
state of Florida; correct? 

A. Correct. 

Q. And you are currently working on Medicaid suits brought by the state of 
Oklahoma and Washington; correct? 

A. Correct. 

Q. Are you working on any other cases as a consultant for the tobacco
companies?

A. No.

Q. And when were you first retained in the Minnesota case to review the
plaintiffs' model?

A. I'm trying to remember. The first contact on any case, I believe,
was for ... and that was perhaps nine months ago. That's my memory. But I --
could be off by a little bit.

Q. And the Minnesota case --

A. Was --

Q. -- came after that?

A. I believe so. I believe the first connect -- contact was with respect to
... and then some of these other ones came up, which is because of the
... in some of the issues.

Q. So you've been working as a consultant in the Minnesota case for
something less than nine months?

A. I believe that's correct.

Q. Now you are a faculty member of the department of statistics at Harvard
University; correct?

A. Correct.

Q. Now at Harvard they also have a department of biostatistics; correct?

A. Correct.

Q. You're not a member of the faculty of the department of biostatistics
at Harvard; correct?

A. Correct.

Q. Sir, your publications generally deal with statistical methods and
proposing new statistical theories as opposed to applying methods to actual
databases; correct?

A. I -- I wouldn't characterize it necessarily that way. That's --
there's certainly a preponderance of publications that are like that, but
there are also quite a few publications that are doing applications. Maybe it's
a fair characterization, maybe not, but --

Q. Let's just stop there.

So your answer is maybe it's a fair characterization and maybe not?

A. I think the way I would answer that probably would depend upon what I'm
working on most recently and what was on the top of my mind. But certainly I --
if you would ask the question again, perhaps I could give a sharper answer.

Q. Sure. My question is that -- that the great majority of your publications
deal with recommending statistical methods and proposing new statistical
theories as opposed to actually applying statistical methods to databases;
correct?

A. That's an accurate description of the publications I have, that's
correct.

Q. Now you're not an expert in the modeling of health-care problems;
correct?

A. I wouldn't claim I'm an expert in that, no.

Q. And you did not consult with an epidemiologist about this case; correct?

A. Not formally, no.

Q. Well, did you talk to an epidemiologist about your opinions in this case?

A. In casual ways. I have friends who are epidemiologists, and the topic
has probably arisen.

*23 Q. But have you gone to an epidemiologist and said here are my opinions,
I want to discuss them with you?

A. No, I have not.

Q. You've not consulted with a health economist about your opinions in this
case; correct?

A. In that formal sense, that's correct.

Q. Are you also talking now about informal conversations --

A. That's correct.

Q. -- or casual conversations you may have had with friends and colleagues?

A. That's correct.

Q. You did not consult with any medical doctors; correct?

A. That's correct, in a formal sense again.

Q. Now you've not consulted with Dr. Wecker about this case; correct?

A. Correct. Well, I've not consulted in --

We've talked since I arrived here in Minnesota in the last two days, we've
talked briefly, but before that, no.

Q. Now you've not consulted with Dr. Brian McCall, who is another one of
defendants' damages experts, regarding your opinions in this case; right?

A. That's correct.

Q. And you've not consulted with any of the experts hired by the tobacco
companies in this case regarding your opinions; correct?

A. Correct.

Q. Now if you were asked to prepare a model to measure the health-care
costs for treating diseases caused by smoking, you would use the Surgeon
General's reports to conclude that smoking causes disease; correct?

A. Sure, I would use the conclusions from that report, yes.

Q. Now as I understand it, your area of expertise is not in choosing
variables or factors to be included in such a model as the one I've just
described; --

A. That's not --

Q. -- correct?

A. That's correct.

Q. Okay. I mean your area is how to adjust for background factors; right?

A. That -- that's correct.

Q. Now would you consult with an epidemiologist regarding the relationship
between smoking and disease if you were asked to prepare a model to estimate the
smoking-attributable expenditures in a -- in a state like Minnesota?

A. Yes.

Q. And would you also consult with someone who knows far more about
databases than you do?

Q. Would you consult with a health economist who had experience with health-
care expenditures for smoking-related diseases?

A. In order to do an analysis, yes.

Q. Yes, we're talking about basically doing an estimate of smoking-
attributable expenditures.

A. Yes.

Q. Now you would also want someone whose career and expertise were focused
solely on biostatistics as part of your team; correct?

A. That's a more difficult question, but I certainly would welcome the --
the help of -- of anyone who's -- who's qualified. And putting together a team
of the best qualified people is important.

Q. Well wouldn't you want someone on your team of experts who had expertise
in the area of biostatistics?

A. Sure. Yes.

Q. I mean you haven't had a focus in your career on biostatistics; correct?

A. I haven't had a -- a focus on it, but I certainly do things that are
related to biostatistics, and some people think of me as -- as a
biostatistician type who has qualifications in it. So I just want to be clear on
my answer to your question.

Q. Well my question to you, sir, was a simple one, and that is: Would
you want a biostatistician on your team if you were asked to prepare such a
model?

A. Yes.


Q. Now you have extremely limited familiarity with the topic of the health-
care costs of smoking; correct?

A. Correct.

Q. And you've not consulted with anyone about that topic; correct?

A. Correct.

Q. Now once you've assembled this team to do this estimate of smoking-
attributable expenditures, you would want them to work together, to collaborate
as a group; right?

A. Correct.

Q. Now are you familiar with the plaintiffs' experts who worked in
connection with the damages model?

A. Yes. You mean do I know them all personally or familiar with -- with
their work?

Q. Well let me ask you individually. Are you aware that Dr. Jonathan Samet
is an epidemiologist who has contributed to numerous Surgeon General's reports
on smoking-related disease?

A. Yes, I'm aware of that.

Q. Are you aware of the fact that he was educated at the Harvard School of
Public Health, which is one of the leading schools of public health in the
...?

A. Yes, I was aware of that.

Q. Are you aware that he is chairman of the department of epidemiology at
Johns Hopkins University?

Q. Now you mentioned that Dr. Scott Zeger was a student of yours. That --
that was several years ago; right?

A. He took a course of mine in the '70s when he was a graduate student,
that's correct.

Q. In the '70s. 

A. Yes. 

Q. So perhaps 20 years ago. 

A. Correct. 

Q. And you realize that he is now chair of the department of biostatistics 
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at Johns Hopkins School of Public Health; right? 

A, Yes, I do. I know that. I 

Q. You know that he's the author of a number of articles on health-care 
issues. 

A. Yes, I -- I know that. 

Q. And are you aware that Dr. Leonard Miller is a health economist from the 
University of California at Berkeley who has worked for the Centers for Disease 
Control, CDC, in estimating smoking-attributable expenditures in the United 
States? 

A. Yes, I'm aware of that. 

Q. And are you aware that another one of plaintiffs' experts is Dr. Timothy 
Wyant, who is a biostatistician from Johns Hopkins University with a special 
expertise in large databases? 

A. I wasn't aware of that special expertise, but I was aware of the rest. 
testimony to this court and this jury? 



o 


fill 


is 




>e in large 
wasn't aw 
lid you rea 
res, I did. 

Do you recal s 
have no s 
fou don’t d 
I don't hav 
>w as a st 
irr 'fejCx T* 

»jEr es ' 

scffelfGSS'. 

Q. And % p r 
repip^Spf publishe 
^0 A. Yes- 

fact, look 
thrddj^ modeling; 

£6. jYes . 

%$rkow for a m 
of yil^urgeon Gen 
be t ptori ih into acco 
‘ ^Correct. 

Q. Now do you believe that smoking causes lung cancer? 

A. I'm not an expert in that area. 

Q. Do you have a personal belief? 

A. Yeah, I believe it does. 

Q. Do you believe that smoking causes laryngeal cancer? 

A. I believe it probably does. 

Q. Do you believe that smoking causes coronary heart disease in men and 
women? 

A. I believe it's a contributor, probably. 

Q. Do you believe that smoking causes stroke? 

A. I believe it’s a contributor. 

Q. Do you believe that smoking causes oral cancer? 

A. I believe it's a contributor, all in the same sense as before, as a lay 
person, not as an expert in those areas. 

Q. Now when you say it's a contributor, my question to you was: Do you 
Q. Do you recall that he testified about that special expertise? 

A. I have no specific recollection of it, but I don't doubt that he did. 

Q. You don't doubt that he has expertise with large databases; correct? 

A. I don't have any reason to doubt that if he represents that. 

Q. Now as a statistician you make use of prior information in modeling; 

s build models that are built on understanding of 

formation can include personal beliefs derived from 
rature and other sources; right? 

at prior information helps you in answering questions 


n health-care costs of smoking, you'd want to make use 
reports because they are prior information that should 
orrect? 



j 
believe that, for example, smoking is a major cause of lung cancer in men and 
women? 

MR. BIERSTEKER: Objection, Your Honor, it’s asked and answered. 

MR. HAMLIN: No, I — I think that — 

COURT: I think that the question is the "major cause" now, so I suppose 
answer that. He did answer that he believed it was a cause of lung 
I believe. 

Let me ask you this — 

WITNESS: Shall I answer? 

COURT: I'm sorry? 

WITNESS: Shall I answer, or — 

MR. HAMLIN: I'll withdraw the question. 

THE COURT: He's going to re-ask the question. 

BY MR. HAMLIN: 

you belie 
res, I belie 
>o you belief!; 

believe it 
?o you beli 
I believe it 5 
?w you wou 
Eonalstudi^esj 
:orre fff 
Jow ifNwmn 
of the Surg^fBS^Sen 

^jj^hat’s not 
desiaiiited many of 
-oofcQiF parts of t 

revij&w/v although c 

%gpWe!l the pi 






oay 


n 



; correct? 
orrect. 
o in the 1 

es. But th 



at smoking is a cause of esophageal cancer? 

's a contributor. 

at smoking is a cause of chronic bronchitis? 
ably is. 

at smoking is a cause of emphysema? 

ee that statisticians should not ignore history or 
addressing difficult problems in science; correct? 

ion with your work in this case, you haven’t reviewed any 
reports that conclude that smoking causes disease; 

|correct, because you — I think the plaintiffs 
for me to look at, thousands of pages, and I tried to 
at least, so there's - - in some sense there’s been a 
inly not a reading of them. 

fs designated those Surgeon General's reports just 


re 


w days you have looked at the Surgeon General's reports 


re designated prior to my — at least some of them, I 
believe, were designated prior to my last deposition in Boston in the middle of 
April, and so some of that material I tried to look through too. I mean I agree 
there are thousands and thousands of pages, so one cannot read all -- all of 
that. But I tried to get a feeling for what's in that based on skimming of 
parts. 

Q. Well sir, isn't it true that you just glanced at a few pages of the 
Surgeon General's reports prior to your deposition in Boston on April 13th? 

A. Well I certainly didn't read every page. I -- and I went through -- 

Yeah, basically I glanced through pages of it, that's correct. 

Q. You have not made any kind of systematic study of the Surgeon General’s 
reports on smoking and health; right? 

A. Correct. 

Q. And that played no part in your formation of opinions in this case; 




correct? 

A. Correct. Played no part. 

Q. Now if you were preparing a model to measure the health-care costs for 
treating diseases caused by smoking, you would consider prior studies of health-care 
costs of smoking; right? 

A. Correct. 

Q. And taking into account prior information from, say, the Surgeon 
General's report and from studies about the health -- health-care costs of 
smoking would make the model more accurate and more precise; correct? 

A. In general I would expect so, yes. 

Q. Now did you prepare any drafts of your expert reports that were submitted 
in this case? 

A. Do you mean did I -- was the -- were -- were the reports that I submitted 
the only thing I ever wrote and didn't write drafts before that? Yes, I prepared 
drafts before that. 

Q. And did you discuss those drafts with anyone? 

Q. With who? 

A. I discussed them with a person who was doing the computer work for me, 
who works for me, -- I discussed it with -- with Peter Biersteker for -- 

to make sure that things I was saying were both clear and accurate. 

Q. And you created a number of drafts on your way to actually preparing a 
final expert report; right? 

A. Yes. Whenever I write, I always create lots of drafts. 

Q. And as soon as you had a new draft, you threw the old one away; right? 

A. That's what I always do, yeah. 

Q. Now let me direct your attention, sir, to one of the notebooks that is on 
the shelf in front of you, it's marked "NMES BRFSS and NHANES." Do you see that? 

A. I'm sorry, the one up here? It's called 

Q. Yes. That's the one. 

All right. 

Q. Now I want you to turn to Trial Exhibit 18930. 

A. Yes. 

Q. Now this is a sample design of the 1987 NMES household survey; correct? 

A. Yes. 

Can we title page up. 

MR. HAMLIN: That's already in evidence, Your Honor. 

Q. And this was prepared by the Agency for Health Care Policy and Research; 

A. That's what it appears to be, yes. Yes. 

Q. Have you ever reviewed this document, sir? 

A. It was -- 

Again, I think this was one of the documents that was designated, and I saw 
it probably a day and a half ago when I tried to glance through parts of it to 
get a feeling for what's inside. I did not carefully re -- 

Q. Was that the -- 

Was that the first time you saw the document? 

A. I can't be sure. I may have seen it before. I just don’t remember. 

Q. You don’t have any specific memory of sitting down and reviewing this 
document prior to just a couple of days ago. 

*27 A. That's correct. 

Q. Now let me ask you a couple of questions about the NMES household survey. 
Would you agree that the NMES household survey is a national probability sample 
of a civilian, non-institutionalized population living in communities across the 
United States? 

A. I believe that's what it is. 

Q. And is the survey representative of the civilian, non-institutionalized 
population of the United States in 1987? 


A. "Representative" in a technical sense? It's designed to be, 
it's designed to be representative. 

Q. Would you agree that the NMES household survey consists of approximately 

35 'fill# ersons lEM households? 

A. That's what it says here, so I have no reason to doubt that. 

Q. Would you agree that the sample in NMES was -- was designed to provide a 
larger representation of population groups of special interest to the federal 
government, including poor and low-income people? 

A. I believe I noticed that somewhere, and that's -- that's -- that's a 
common thing to do in federal surveys. 

Q. And that is in fact a characteristic of the NMES survey; right? 


A. I -- I believe that's right. I'd have -- to be absolutely sure of that -- 
if you're reading something, then I would certainly agree that's what it 
says. 

Q. Let me direct your attention to the page that follows the 
title page. Do you 

wfl iR M b'iffini ii i / i w w o • 

H1S& 11 right. T 

Wes, 

right. L 

na no 


page 


P°FF 

woul 




S3 

co C 


es, 1 see t! 
o you see t 
es. 

tates that 
on groups o, 
ve been obt! 



amilies." See that? 


me direct your attention to the page that follows the 
j^j|hat there's a column there marked "Background?" 

jgjd the middle column? 

ite direct your attention to the last paragraph on that 


sample is designed to provide a larger representation of 
cial policy interest to the Federal Government than 
from a random sample. These groups include poor and low 


Q. And you have no reason to disagree with the accuracy of that statement, 
do you? 

A. No, I do not. 

Q. Now would you agree that information was obtained during the NMES survey 
about the families' health and health care? 

A. Yes. 

Q. Would you agree that information was obtained on illnesses, use of health 
services and health expenditures for each family member? 

A. I believe that's right. 

Q. Would you agree that there was a supplemental survey that was sent out 
that asked smoking questions? 

A. Yes. 

Q. Now have you reviewed the NMES questionnaire? 

A. No, I have not. 

Q. Have you reviewed any specific NMES data? 

A. I -- I personally have not reviewed any NMES data. 

Q. Was this done by your assistant? 

A. Yes. The computer disk that had NMES data on it, the review of that 
material, the analysis of that material was done by my assistant. 

Q. Would you agree that to verify and supplement information in NMES, there 
was -- there were two additional surveys, first was a medical provider survey; 


remember tf^ 
Okay. Let^ 
has^^^column mark#?: 
titlni™*|ji^usehold SufV| 
^^^ast full P a p|| 

^Now last senC 
inf opjijat ion provide pt 
II ifeiiibded two addpp 

dv is. .. Lj 

t Hafj F**® 
seerHrot. 

Q. It thuf 

phss^fe^ans, hosoitaos. 


Ph^ 

agen 


enfebetg 

01 

a 


used by thl 


orrect? 
have no re 
bsolutely 
nd you — 
you aware 
informati 



but — yes, 

^direct your attention to the bottom of the page which 
Background" on it, it's the last paragraph under the 

2 ' See that? 

>h ? ■ 


|, it says, "In order to verify and supplement the 
jhousehold respondents, the household component of NMES 
lal surveys." 


, "A Medical Provider Survey obtained information from the 
utpatient clinics, emergency rooms, and home health 
sehold sample during 1987." 


o disagree with the accuracy of that statement; right? 


here was also a health insurance plan survey that 
the private insurance of persons in the household 


Q. Do you agree that all survey components were designed to provide 
statistically unbiased estimates? 

A. That's the standard design feature, that's correct, so I believe that was 
done here. That's survey jargon, just to be -- it has a technical meaning, and 
I'm trying to clarify that. 

Q. Now let me direct your attention to the page that has the column marked 
"Survey Samples." 

A. Yes. 

Q. First line is as follows, "All survey components were designed to provide 
statistically unbiased estimates." Do you see that? 

A. Yes. 

Q. You agree with that; right? 

A. Yes. And — and — with the technical sense, absolutely. That’s the way 
NMES has information on illness? 
rmation on the use of health-care services; correct? 
ion on health-care expenditures; correct? 


char. 




the standard government surveys are designed. 

Q. Now let me direct your attention to the previous page, to the third 
column, specifically the first full paragraph that reads, "Together, the major 
components of NMES II contain information to make national estimates of health 
status, use of health services, insurance coverage, expenditures, and sources of 
payment, for the civilian population of the United States during the period from 
January 1 to December 31, 1987. Oversampling of population groups of special 
policy interest makes possible in-depth studies of these groups." Do you see that? 

A. Yes. 

Q. Do you agree with those sentences? 

A. I agree that's what they're doing -- what they're trying to do, yes. 

Q. Now you agree that NMES has information on smoking; correct? 
he suppleme 
?he suppleme es, 

b^|es. Yes. 

' 0. ^0 you agre 
Yes. 

tod NMES has 
(es . 

Q. NMES has 
S^Yes. 

fact, 

*■ irist. J ~~ 

Chat 
„et 

do cumen t i 

It’s — it 

|6f- ]| 2 4 2 . Do you 
Wf^fYes, I do. 

The documen 
1992," do 
|Yes, I do. 
ht 's a docu: 
lent of Heal 
|Yes. 

Q. Let me direct your attention to the introduction, which is on the page 
with Bates No. 901. 

A. I'm there. 

Q. You're there. 

A. Yes, I'm there. 

Q. Okay. Let me direct your attention to the fifth full paragraph that 
begins "Since 1984...." See that? 

A. Yes. 

Q. It says, "Since 1984, the Minnesota Center for Health Statistics of the 
Minnesota Department of Health, entered a cooperative agreement with the Centers 
for Disease Control (CDC) to develop and implement the Behavioral Risk Factor 
Surveillance System." Do you agree with that? 

A. Yes. That's what it says. 





t know of any other national data set with all 
? 

xisting national data sets, correct, 
attention now to Trial Exhibit 242, 


bo«pMHS|5 you have — 


of tr.cse 


which is the first 


.986 





first document, 
that? 

titled "Behavioral Risk Factor Surveillance in Minnesota 
e that? 

repared by the Center for Health Statistics, Minnesota 
rrect? 


wit 


beg 
Q. Goes on to say, "The surveillance system is a monthly telephone survey 
conducted throughout the year." Do you agree with that? 

A. That's what it says, yes. 

Q. Next sentence states, "All survey activity is conducted using a computer-assisted 
telephone interviewing system. The surveillance system uses a standard 
data collection instrument for assessing behavioral risk and a standard protocol 
for interviewing procedures and data collection. The survey instrument was 
designed to allow participating states to add state-specific questions." 

Q. Let me ask you this: Have you ever reviewed or seen the interviewing 
instrument that is used in the Minnesota Behavioral Risk Factor Survey? 

A. You mean the questionnaire? Is that what you mean? 


ce 



?he questio: 
Jo, I have 
low are you 
ig the asse 
ies, I am a 
,et me dire 
’The sample 
lOO interview 
maintained at 3,42 
ta w Yes." 

you have 
.No, IfgeSS^i’t 
gJAnd 

report repi^&gywts 
thi^ 5 ™Xgport ™ sm 
|g, chronic 
fith that? 
iThat’s what 
You don't hi 

[>No, 1 do no 
Ido you agre 
^is to selec 
lion in Minn 
lln the tech 
Q. 

cori ^ ? 

fA. Correct. 


re. 



that Minnesota has added questions to that survey 
of general health status? 
f that . 


r attention to the bottom of the page. The last sentence 



of annual completed interviews has more than doubled 
1984 to 3,420 in 1988. The sample size has been 

that? 



al completed interviews through 1992." Do you see 
eason to doubt the accuracy of that statement? 





there's one final paragraph that states that "This 
n behavioral risk from 1986 to 1992. The risk factors in 
physical activity, overweight, hypertension, acute 
ng, drinking and driving, and seatbelt nonuse." Do you 


ays, yes. 

* y reason to dispute the accuracy of that statement; 


■actor 


t the sampling objective of the Behavioral Risk 
obability sample which accurately reflects the 

ta!? 

sense of a probability sample reflecting, yes. 

Now the Behavioral Risk Factor Survey has no expenditure information; 



rley ?" 1 


Q. And it has no disease information; correct? 

A. Correct. 

Q. Now prior to this case were you aware of the National Medical Expenditure 
Survey? 

A. Yes, I — I'd heard of it. I was aware of it in a general way. 

Q. Prior to this case had you ever had occasion to use the National Medical 
Expenditure Survey? 

A. I don’t believe so. I say that as I believe, because it’s possible that I 
consulted on some project where people were using it, but — and maybe perhaps 
gave them some advice, but I myself was not a major user. 
you 


Q. Now you haven't reviewed any of the individual Minnesota Medicaid claims 
data used in this case; correct? 

A. Correct. 

Q. And you have not reviewed any of the individual Blue Cross Blue Shield of 
Minnesota individual claims information used in this case; right? 
A. Correct. 

Q. And would it be fair to say that since you haven't reviewed that data, 
you really have no idea what the relationship is between expenditures and 
disease in that data set -- or those data sets? 

A. Expenditures and disease in BRFSS you say? 

alking about the claims data for both the state of Minnesota — 

>h, I'm sorrj 

and Blue Shield of Minnesota. 

I'm sorry. 

sonally have not examined it, that's correct, 
you personally don't have any idea as to the 
enditures and disease in those claims data sets; 





[- and Blue 
, ... ihe claims d 
s — well 
Q. Jmd accordini 
reLfH&ygflship betweel 

WPihat ’ s do I 
From my personal 
if you wj 
costPTndMinnesota, 
SurvJfc righf^ 
p£ghrerha||h^If 
probably use safefea I' 
>e youcoul 
ill right. 
TShould I try 
ill right, 
vcare costs 
lational Me 
;f I were n 
databases,! 






[lyses of — of that — of those data, that's correct, 
rying to estimate the effect of smoking on health-care 
would make use of the National Medical Expenditure 

re were no other data that could be obtained, I would 
— I'm not sure what the scope of your question -- 
rase the question. I'm struggling a bit. 



xplain why I'm struggling? Would that help you? 

question is: If you were doing a model to estimate the 
oking-related diseases in Minnesota, would you make use 
Expenditure Survey? 

e to collect more information but just had to use 
. ... 1 would. 

Q. Now if you were trying to estimate the effect of smoking on health-care 
costs in Minnesota, would you make use of the Behavioral Risk Factor Survey in 
Minnesota? 

A. Again with the same understanding, if I could not collect more data, then 
I would make use of that information, too. 

Q. Now when you say "collect more data," are you talking about doing your 
own survey? 

A. Yes. Collecting information, for example, on the public aid and private 
people in Minnesota. 

you are then suggesting that you would go out and knock on doors of 
public aid recipients in Minnesota and subject them to interviews so you could 
have data in order to estimate health-care costs in Minnesota; that — is that 
right? 

*31 MR. BIERSTEKER: Objection, Your Honor, I believe that question is 
argumentative. 

MR. HAMLIN: I don't believe it's argumentative, Your Honor. 

THE COURT: No. You may answer that. 

A. I -- I'm not suggesting they go out and knock on doors of public aid 
recipients to bother them. I don't believe that's the way BRFSS was done, and I 
-- I don't believe that's the only way of -- of collecting that kind of 
information. 

Q. So you wouldn't recommend that; would you? 

A. I would not recommend what? 

Q. A survey where people go out and knock on doors of public aid recipients 
in order to gain information for your -- for your model. 

A. I'm not necessarily recommending that. You proposed that as a 
possibility. It -- 



saying is -t&at 
ding that 
ut you woulkf^~o 
f someone &pop 
od how -- tfver'- 
elp them, 
ut you woul 
ind what h 
Such a surv 
If there in 
on would 
qu^j^-on 
thLuaailv 
The wa 

n. cSfH>ubli 

'm sorry. 

IP^Lt was the 


infoj 

|My 

f aT"' 



what I ’-m recommending. I’m saying no, I’m not 
ticular. 

sider it; right? 

ed to me that that’s a good way to get information who 
the conduct of that — of a survey like that, I would 
t recommending that, no. 
d that helpful; right? 

7 

ion were available from such a survey, then that 
pf ul. 

u, sir, was: Are you suggesting and recommending tV,at a 
o go about estimating health-care costs in Minnesota — 



the 


the 




recipients? 

ay, is that correct? 
s the only way? 


And your question there, is this 




m I saying 
eah. 

eah, it’s ifc> t-e <he only way. 

ow is that |iwaj’ that you would choose because 


you believed that that is 


way? 


tter to collect relevant data than try to piece together 


believe it ifc 

pieces of data that are -- that are -- are less relevant. So it's better to have 
more relevant data than less relevant data. 

Q. And is it better, then, to get that relevant data by taking a survey of 
public aid recipients to gain information about them? 

A. It would be -- it would be better to get the information from them. There 
are ways of -- there -- 

The "survey" is a general word. I don't necessarily mean knocking on doors. 
It would be better to get the relevant information from the people in Minnesota 
than rely on kind of synthetic estimation and estimates at synthetic estimation. 

Q. Now are you then suggesting that the public aid recipients be contacted 
by telephone? 

A. I'm not making any specific suggestion of how that information is to be 
gathered. It may be sitting in — in data files already that would have to be 
obtained. You may have to try to do it through telephone survey. I don't — I 
don't know. I'm not making any recommendation that way. I don't know what state 


that information is in. I'm just making a comment that it’s — it’s better to 
have more relevant information than less relevant information. 

*32 Q. Well my question to you is: If you had the NMES database and the 
BRFSS database in front of you, would you do a further survey in order to gain 
information about the public aid population in Minnesota? 

A. I would certainly recommend considering that, because it's -- it's more 
relevant information to the Minnesota public aid recipients. 

Q. And at the very least you would want to check their records at the agency 
that is responsible for conducting and regulating the public aid program in 
Minnesota; right? 

A. I'm not sure. I'm -- I'm just -- I'm just talking about the information, 
not talking about a particular mechanism for gathering that information. 

Q. Okay. You don't have any idea how to gather that information; is that 
your testimony? 

A. My testimony is that I know how to design surveys to gather information, 
but it's not the physical -- I don't -- I'm not an expert in the physical 
process of gathering information from data files or from -- from people through 
telephone surveys or door -- door-to-door surveys. 

Q. Well let me ask you this: What are the ways that you can think of as you 
sit here today of gathering information from the public aid population in 
Minnesota if they don't include knocking on doors, calling them by phone, or 
looking at confidential records kept by public aid agencies? 

A. Well you -- 

There are ways to look at confidential records that still preserve 
confidentiality, and the characterization that you're looking at confidential 
records means that you have accessed the confidential information. But there are 
ways of doing surveys that preserve confidentiality, and so that -- I don't know 
what -- what state those -- those records are in. I would speak with somebody 
who understood that and understood how to preserve confidentiality to help. And 
there are committees on preserving confidentiality of the American Statistical 
Association, of which I've been a member, so there are issues of how to preserve 
confidentiality in surveys. 

re you her^ilMirc.ifying that the plaintiffs' experts should have 


SF^Are you her 
idpspbd such a s 


that 

publ 


Jell I — as 
3s relevant 


A. Well I -- as I said before, it's better to have more relevant information 
than less relevant information, and so I -- I'm -- I'm saying that it would have 
been better to have information on Minnesota recipients of public aid than to 
not have it. 

Q. Can you answer my question, sir? 

A. I believe I did. 

Q. My question is very simple: Are you suggesting or offering an opinion 
that the plaintiffs' experts in this case should have conducted a survey of the 
public aid population in Minnesota in order to estimate the health-care costs in 
Minnesota? 

A. I am saying that the information from such a survey would have been more 
relevant and it would have been better to have that information than not have 
that information. 

MR. HAMLIN: Move to strike, Your Honor, it's non-responsive. 

THE COURT: That answer will be stricken as non-responsive. 

Q. Would you answer my question. 

*33 A. I don’t know how else to answer it. 

Q. Sir, are you offering an opinion that the plaintiffs' experts should have 
undertaken a survey in order to gain information about the public aid population 
in Minnesota, either by telephone, by knocking on doors, or by looking at 
confidential records? "Yes" or "no." 

A. The "should" is -- there's a problem there. They would have been better 
off had they done so. I don't know how else to answer it. 

Q. And if you were doing it, that's what you would have done; is that right? 

A. I would have explored that possibility with people who know more about 
the databases than I do, and how to gather that information. 

Q. And you don't know about those databases; right? 

A. Not as I sit here today, no, I don't. 

Q. And you don't know how to gather that information; right? 

A. I don't know the details of how to gather that information, that's 
right. 
Q. You really don't know what kind of invasive effect that may have on 
public aid recipients in Minnesota; right? 

A. That's correct. 

Q. Now if you were trying to estimate the effect of smoking on health-care 
costs in Minnesota, you would also use the Medicaid claims database; right? 

A. Yes. 

Q. And you would also use the Blue Cross Blue Shield of Minnesota database; 


alon 


# with ti 
fflrfhat ’: 




co: 


conf 


plai 




looked at the database of the Minnesota Behavioral Risk 
rsonally, no. 

iersonally review the computer data that was submitted 
s' damages experts' reports in this case; correct? 

personally did not. 
lse do that for you; right? 

— I supervised it. 

u agree that regression is a statistical method that is 


orr 

ow have you 
urvey? 

M o. I have .no 
ow you did 
the plai 
s corrept, 
ptou had some 
iMfhat's corre 
ow sir, wou 
used? 
es. 

Q. Would you agree that regression can be used for controlling potential 
confounders? 

A. It can be used, sometimes with less success than other -- other times. 

Q. Would you agree that Drs. Zeger, Wyant and Miller used regression in their 
model? 

A. They used it, yes. 

Q. Now would you agree that maximum likelihood estimators are statistical -- 
or are a statistical method that is commonly used? 

A. Less commonly used than regression methods, but yes, commonly used. 

Q. I mean you’ve used the principle of maximum likelihood; right? 

A. Yes. 

Q. And Drs. Zeger, Wyant and Miller also used that principle; right? 

A. That's correct. 

Q. Now let me ask you this: You're familiar with the term probit model; 
right? 

A. Yes, I am. 

Q. That's a statistical method for predicting probability; right? 
A. Yes. 

Q. In fact, you've used probit models; right? 

A. Yes. Once in a while I have, yes. 

Q. And other statisticians, to your knowledge, have used probit models; 


rig 


stu 




mode 


advo 



A. Yes. 

Q. And you agree that probit models are used or can be used in health-care 
studies to estimate the probability of using a medical service; right? 

A. It can be used to try to estimate that. There's implication that it's 
going to turn out right, which is not true. You can misapply maximum 
likelihood, you can misapply a probit model, but yes, they've been used. 

Q. They can be used for that purpose; correct? 

A. Yes. 

Q. And have been used for that purpose; right? 

A. Yes. 

Q. And Drs. Zeger, Wyant and Miller have used probit models in their statistical 

rect? 
mas 

t? 
ept. 

eator of the theory of multiple imputation; right? 
ct your attention, sir, to — 

his first: Isn't it true, sir, that you’ve been 
s this theory of multiple imputation? 
ts, yes. Not — not as the only way to address problems 
-Response, but I've been advocating it in certain 
xnk it is the best method to use. 
vocating that through published literature; right? 
nces where I think it works, that’s correct, 
ted it through your speaking engagements; correct? 
t. 




tow I want t 
let me as.k 
ng for two 
r n certain cc^a&a, 
ng data and 
nces where 
d you' ve b' 

— in cir 
u've also 
es, that's 



>w let me ask you about — 

\y I correct that? 
g- missing data. 

A. May I adjust that answer a little bit? Because there have been speaking 
engagements where I've been asked to in effect advocate where I've indicated 
I don't think it's the right thing to do. So it's not that I always 
advocate, I advocate when it's the appropriate thing to do and say no, it's not 
appropriate in other cases. 

Q. Well you've published a great deal on multiple imputation; right? 

A. Yes, I have. 

Q. And when you talked about that body of literature, you were talking about 
-- in large part about the literature that you published, right? 
A. I don't think that's a fair characterization. Maybe the theoretical 
literature in earlier years, that was dominated by me, but there are other 
publications by other people. In fact I saw a recent publication that has 
"multiple imputation" in the title, in fact an article in Biometrics, that 
didn't even refer to me in the article, not even the reference list. 

Q. Well let me ask you about public data sets that have missing data. That's 
not an unusual phenomenon; right? 

A. No, absolutely not. It's common. 

Q. I mean it's -- it's common in, for example, the U.S. census; right? 

A. Absolutely. 

Q. It's common in the Fatal Accident Reporting System of the Department of 
Transportation; right? 

A. Correct. 

Now with re; 
ier than mu. 

Fes, it has. 

Q. Would you agree there are other statistical methods besides multiple 
imputation that are commonly used to impute missing data? 

A. That are commonly used, yes. Well more so before than now, but yes, there 
are other methods. 



wa 


in 




to the U.S. census, that has handled missing data in 
imputation; right? 


arti 


Q. I mean 
s. That d 
• HAMI#ffl*$ MO 
d m 

t ™AT m You^r ayyte th 
^rjjgnot Surfe w 
^ ^^ The methods 
WFror imputing' 
- for impu 
Correct. Th 
In fact the 
by you in 
I said I be 
ut that wa 
think tha 
§%s8# And prior t 






methods have been around for a long time; right? 

make them correct, 
strike. 

pds have been around for a long time; correct? 
ods that are now -- that are still being — 
u mean. The methods are still being used? 
than multiple imputation — 


data have been around for a long time; correct? 
back to the pre-computer days, that’s correct. 

time imputation had been written about was in an 
correct? 

’76, but that’s about — I won’t quibble, 
first time; right? 
ght, yes. 

time statisticians were filling in missing data using 



otheggatatistical methods; right? 

A. Yes, that's correct. Or they were doing other things to handle -- I -- I
don't want to mis -- mischaracterize what statisticians were doing. They were
sometimes filling in missing data for item non-response, but the standard method
and -- and still the standard method and not a bad method typically for handling unit
non-response is to do weight adjustments. So I didn't want to leave an
impression from my answer that whenever statisticians saw missing data, they
automatically ran to imputations. They did not. With unit non-response, they
typically went to weighting adjustments.

Q. So weighting adjustments have been around for a long time; right? 

A. Yes, that's correct. 

Q. Prior to multiple imputation; correct? 

A. Correct. And for that problem it is probably -- for that problem of unit
non-response, it very often is a -- is a better thing to do.

Q. And you're familiar with the procedure known as hot deck; is that right?

A. Yes.

Q. And the hot deck procedure is also used for imputation purposes and has
been around for long -- for a long time; right?

A. Yes, that's been -- that goes back to the beginning of the century with
Hollerith cards, IBM cards, that's where the name comes from. It's pre-computer.

Q. In fact it's still a common practice, right?

A. It's becoming less so as more and more problems are revealed with the
method. It's still used.

Q. It is still used; correct?

A. Yes. Yes, it is.

Q. In fact NMES uses that procedure; right?

A. Yes, it does.

Q. Now you mentioned weighting.

A. Correct.

Q. And that is a common method in major surveys for handling unit
non-response; right?

A. Correct.

Q. NHANES uses it; right?

Q. In fact you think it's a fine thing to do; right?

A. For unit nonresponse of a particular type, it's a -- it's a fine thing
to do.

Q. Now the NHANES data set was the data set used by the plaintiffs in the
nursing home model; correct?

A. Correct.

Q. Now is it your testimony that before multiple imputation came along, all
the work that people were doing to impute data was wrong?

A. Basically all the -- all the -- all the work that was being done re
filling in one value, not talking about weighting adjustments for unit
non-response, but the imputation stuff, led to invalid inferences, and that's
why, for example, the National Center for Health Statistics who has the
responsibility for producing NHANES is no longer doing it. They are not
releasing any files with -- with those imputations because they've come to the
conclusion that it's invalid, and they're releasing the new files with -- with
-- with multiple imputations.

Q. Sir, can you answer my question?

A. I thought I did.

Q. Is it your testimony that before multiple imputation came along, all the
work that people were doing to impute data was wrong?

A. It — it was — it was wrong. The — how wrong it was depends upon the 
particular application and the particular method. 

Q. But you're the one who got it right; correct? 

A. With respect to — to single imputation, people knew that it created 
problems. 

Let me -- let me clarify that. There are different kinds of methods for
trying to deal with the increased variability that comes from single imputation
and the word "methods," based on splitting samples, balanced-half replication,
redoing imputations in jack-knifed or bootstrapped samples to try to properly
reflect the uncertainty. So it's not that everybody was doing it wrong, but
these -- but they were -- it was very cumbersome. And I believe that many people
were doing it wrong, and that multiple imputation seems to be -- have -- is --
for that problem appears to be superior to the other methods.

The consequences of doing something wrong aren't always clear. It depends
upon how much information is missing and how carefully the imputed values are
created. There's large literature on that.

&-Q. ^But you believe that your theory of multiple imputation is far superirr 
Jther methO' 

;orrect? 

[t depends 
ton, and fo 
^plaintiffs 
that that' 
it there's 1 
literature 
proposed and used, 
certain circumstan 
addrfres^g that pr 
)o yof^Bfiin 



of multiple imputation that was used before your theory carr.e 

he context. I think it's far superior to single 
ng about the fact that you single — singly imputed the 
ut for many years survey organizations have realized 
valid and they've tried other methods to reflect the 
jfcsed uncertainty due to imputation. So there’s — there's 
here are lot of methods that are — that have been 


D 





thi 
eir single 
a few thin' 
Q. Well le 
Can you giv> 





as the tab number again? I apologize. 


ere has been awareness of the problem. I think in 
at multiple imputation is — is a superior approach tc 
than these other previous methods, 
the authors of the NMES survey were wrong and are wrcr 
i n theji&Kgf: dejgk procedure as opposed to your theory of multiple 

imputations CTLj 

hihkthatp^iiip^hat there are p ro bi ems created by their sequential hct 
:ation, filling in one value, single imputation hot deci 
read from them, I think they agree. 

Q. Well let me direct your attention, sir, to Trial Exhibit AZ8906.

A. Can you give me a hint on how to find it?

Q. Yes. It's in your large notebook, your literature notebook, title?

A. That's it.

Okay. And what was the tab number again? I apologize.

Q. AZ8906.

Do you have the [illegible]

A. Yes.

Q. See the title of this article is "ANALYSIS OF THE EFFECTS OF IMPUTATION
ON VARIANCE ESTIMATES FOR THE NATIONAL MEDICAL EXPENDITURE SURVEY," do you see
that?

A. I see the title and the author. Is there a publication in here?

Q. I'm asking if you see the title, sir.

A. Yes, I see the title.

Q. And the author is John Paul Sommers, Agency for Health Care Policy and 
Research. Do you see that? 

A. Yes, I do. 

Q. And he is in fact in charge of the NMES survey; right? 

A. I don't know. 

Q. You don't know Mr. Sommers? 
A. No, I don't.

Q. But the Agency for Health Care Policy and Research is responsible for the
NMES survey; correct?

A. Correct.

Q. And that's a government agency; right?

A. Yes, it is.

MR. HAMLIN: Your Honor, we offer --

Q. Well let me ask you another question. You realize that this article was
an article that you designated for your trial testimony here today?

A. Was this one or --

There was another one by --

Q. I'm asking you, sir, about this one.

A. And I'm saying I'm not sure whether --

I think there were two by the same author, and I -- I can't be positive if
this one was designated by me. My memory is that it was designated by the
deposition that you took in Minnesota, but I -- I could



fs in — i 
Q. Let me ask you this: You reviewed this article prior to your testimony;
right?

A. Yes, I did.

MR. HAMLIN: All right. Your Honor, we offer Trial Exhibit AZ8906 as a
government document.

MR. BIERSTEKER: No objection, Your Honor.

THE COURT: Court will receive AZ008906.

BY MR. HAMLIN:

Q. If we could put up the first page.

See that the title of the article is "ANALYSIS OF THE EFFECTS OF IMPUTATION
ON VARIANCE ESTIMATES FOR THE NATIONAL MEDICAL EXPENDITURE SURVEY."

Let me direct your attention to the second column on the first page. Do
you see that?

Q. The first paragraph states, "However, there are two potential problems
with the multiple imputation process.

It is more difficult and expensive. It requires several imputations be
[illegible] which can be on a large scale data set, such as, those produced
[illegible] government surveys. It can be more complex than use of simpler common
[illegible], such as, sequential hot decking." Do you see that?

A. I see that.

Q. And again, the Agency for Health Care Policy and Research is responsible
for the NMES survey and thus it makes decisions about what process to use to
impute data; right?

A. I assume they do, yes.

Q. Okay. 

THE COURT; Counsel. 

MR. HAMLIN: Yes, Your Honor. That's fine, we can break. 

THE COURT: We'll recess, reconvene at 2:00 o'clock.

THE CLERK: Court stands in recess to reconvene at 2:00 o'clock. 

(Recess taken.) 
Minnesota v. Philip Morris, Inc.
Minn.Dist.Trans., 1998
END OF DOCUMENT

May 1, 1998

P.M. Session

JUDGE: Honorable Kenneth J. Fitzpatrick, Chief Judge

TEXT:

AFTERNOON SESSION.

THE CLERK: All rise. Court is again in session.

(Jury enters the courtroom.)

THE CLERK: Please be seated.

THE COURT: Counsel.

MR. HAMLIN: Thank you, Your Honor.

Good afternoon.

(Collective "Good afternoon.")

BY MR. HAMLIN:

Q. Good afternoon, Professor Rubin.

A. Good afternoon.

Q. Professor Rubin, before the break we were talking with NMES and the
choice of imputation procedures. Do you recall that?

A. Yes, I do.

Q. Were the authors of NMES wrong in choosing the hot deck procedure as
opposed to your theory of multiple imputation?

A. They were wrong, I believe, in using that procedure rather than a number
of procedures that would use multiple imputation to handle the item
non-response.

Q. Well let me ask you this: You said that you had done some consulting work
with the people responsible for the NHANES survey; is that right?

A. That's correct.

Q. Now you have advised them that they ought to use multiple imputation;
right?

A. Well it -- it started out as a -- as advice to consider the use of it in
a -- in a long -- in a series of studies to see whether it -- it would improve
and was worth the effort. So the initial advice was just to consider the use of
it.

Q. Now the authors have not yet made a decision whether to use multiple
imputation; is that right?

A. That's incorrect. They have decided to use it.

Q. Have they actually begun to use it?

A. Yes. My understanding -- I just spoke to them a few weeks ago, and the
currently-released public use tape of NHANES has no imputations at all because
they don't -- they don't believe the old way of doing it was valid from their
studies over several years, and the new NHANES that I'm told will be released
within a month or two by them I was told will contain multiple imputations for
all the major variables that had missing values.

Q. Let me ask you this: All the existing NHANES data now uses -- strike
that.

All the NHANES data that is currently available to the public does not use
multiple imputation; right?

A. The data set, that's correct. It has whatever they did before. And
the current one has no imputations in it at all.


Q. And the currently -- yeah, right.

So that there is not a data set for NHANES available to the public that uses
multiple imputation; right?

A. That's correct.

I'll just clarify. It's not a public-use data set in the sense of being
broadly advertised as being available, but I understand approximate -- the first
phase of NHANES has been multiply imputed and could be made available to
somebody who wanted these. At least that's what I was told.

Q. Do you know anybody who has made use of them?

A. The people at NHANES have and -- and people involved in various projects
to [illegible] the multiple imputations that were done are a substantial
improvement over the earlier methods.

Q. Apart from those people, no one else has made use of that particular
data tape using multiple imputations; right?

A. No one outside the NHANES people and people involved in the project,
that's correct. As far as I know. I don't -- actually I should say I really
don't. I know those people have, and I don't know whether anyone else has.

Q. Now I believe you testified that the U.S. census doesn't use multiple
imputation; right?

A. It doesn't use multiple imputation in all of its surveys, but it -- it
does it in some.

Q. In those surveys where it does not use multiple imputation, are the
[illegible]

A. In the sense that are they doing something that's -- that can lead to
invalid inferences and wrong answers? Yes.

Q. Now are there any other government surveys that you can think of that
don't use multiple imputation and therefore are wrong?

A. Many of them --

Many of them do not use it. Most of them use methods that were available and
developed before there were computers, and if -- they're -- they're agencies,
they're bureaucracies, and some of them are moving towards using it as they're
convinced it's -- it's worth the effort to get valid answers relative to the
answers they were getting before.

Q. So are the answers that they got before in the government surveys
invalid?

A. They may well be. Depends upon how much missing data there -- there was,
and --

Yes, it's — that's right. The Federal Reserve moved the survey of consumer 
finances to be multiply imputed years ago because they -- they drew that 
conclusion. 

Q. So if a government survey doesn't have multiple imputation, chances are
very good that the results are invalid; right?

A. I didn't say that. I said there's -- there -- the chances are that they
-- they -- it can be. It depends upon how much missing data there is. If it -- if
there are very few missing values, one percent of the values are missing, then
it probably doesn't make much difference.

Q. Can you name one government survey that doesn't use multiple imputations
that you think is valid?


A. That doesn't -- well they can be valid for certain questions. It's just --
but they're not generally valid.

Q. Sir, can you name one government survey that doesn't use multiple
imputation that you consider to be valid?

A. I want to try to clarify, if I -- if I can, that -- that a survey is not
usually described as valid or invalid. It's questions that are asked, answers
that are obtained from analysis of the survey data that are valid and not
valid. If -- it's not really correct to talk about the whole survey as being
valid or not. That's not that -- not the correct -- correct use of the
statistical jargon.

Q. Can you name one government survey where the answers are valid where that
survey did not use multiple imputation?

A. Yes. Many surveys can yield valid answers without using multiple
imputations if the questions asked of them do not involve factors/variables that
have large fractions of missing information.

Q. Can you name a government survey, sir?

A. Sure. NHANES can -- has many questions that have very little missing
information on them.

Q. Are you talking about the data that's publicly available now?

A. Yes. If you -- some of the --

Some of the questions have very little missing information, and if you're
asking for answers to address those variables that don't have missing data,
then the fact that they did something that's not proper with the missing data
doesn't affect the validity of the answers.

Q. NHANES is one; right?

A. Certain -- certain -- certain questions asked of NHANES, if those
questions do not involve data that have been improperly imputed, that's correct.
But that -- and that general answer goes for all government surveys.

Q. Do you know of answers or questions in NHANES where the information was
not improperly imputed?

A. Questions that are basically completely observed, yes. So -- so questions
involving variables in NHANES that are not subject to missing data, that don't
have missing data, questions involving those variables will be -- will -- will
yield valid answers.

Q. Can you give us --

Just give me an example.

A. In NHANES, I believe there's a -- an interview before they go in and get
the three major health components at the stand, I think self-reported blood
pressure, self-reported height, self-reported weight, I think self-reported
income. Is that NHANES? I think it is. Race. I think those variables are
basically complete in NHANES except for unit non-response, which is accounted
for by weighting adjustments.

Q. So if you have all the information for a variable, then the answer is
valid; right?

A. Yes, because it doesn't depend upon the imputations.

Q. But if you are missing any information and you don't use multiple
imputation, then in your view it's most likely that the answer is invalid;
right?

A. That's not what I said.

Q. Do you agree with that?

A. No, because you put in adjectives that make it incorrect.

Q. Well can you think of a question -- an answer where missing values are
[illegible] in NHANES where multiple imputation was not used and you consider the
answer to be valid?

A. I would have to try to think of the variable. You said now it has some
imputations, but very few, so it's, for example, 95 percent observed, but there


Q. What do you -- are you saying --

A. Observed. Yes, o-b-s-e-r-v-e-d.

Q. All right.

A. I'm -- as opposed --

Q. All right.

A. If the ninety -- 95 percent observed, five percent missing, then it's
unlikely that the -- the lack of representing uncertainty in that five percent
will actually have a major consequence.

Q. Can you give us an example, a specific concrete example?

A. If I had a copy of all these research papers that actually are referred
to in the document we were looking at, for example, by these people at NHANES, it
would be very easy to because we did those kind of studies to find out when it
made a difference and when it did not. I do not have a list of variables in my
mind now that I can refer to, but we could refer to any of the documents
-- documents referred to here by the people at NCHS, and you could see the
studies that show how, for certain questions, multiple imputation is very
important to do, for other kinds of questions it's not. So you can get those
documents and I could -- I can address them. I don't feel it's appropriate for
me just to speculate from memory when there's hard information that's in the
published literature.

Q. Okay. So you can't recall from, as you sit here today, one question or
one variable where there was an imputation of data without the use of multiple
imputation that you would consider to be valid; right? I mean you'd need --
A. I'm there.

Q. That's "On Variance Estimation With Imputed Survey Data;” correct? 

A. Correct. 

Q. And this, too, is part of that debate; right? 

A. It's part of the package. I wouldn't necessarily call it a debate. 

Q. All right. Discussion?

A. Yes.

Q. Okay. Then let me direct your attention to what's called a "Comment,"
which is at page 507 by David R. Judkins.

Now what's -- what's a comment?

A. A comment is someone who's invited to be a discussant of typically one
article, but in this case, because the editors of the journal thought it was
a very important topic, this idea of handling missing data in surveys and
multiple imputation is becoming something of a standard in some sense, they
thought it was important to invite more than one article and have more than one
discussant discuss the package of the articles. So the first three articles you
mentioned, by -- by me, by Bob Fay and by John Rao, were the three articles, and
then there were several discussants who were asked to lend perspective to the
three -- three articles that wouldn't be apparent if only the three authors were
writing.

Q. And there's a further comment by David Binder, looks like it's at page 510.

A. Yes.

Q. And there's a further comment by John Eltinge at page 513; right?

A. Correct.

Q. And then there's a rejoinder by you at page 515.

A. Correct.

Q. Right?

MR. HAMLIN: Your Honor, at this time plaintiffs offer Trial Exhibit AT000254
as a learned treatise.

MR. BIERSTEKER: No objection, Your Honor.

THE COURT: Court will receive AT000254.

BY MR. HAMLIN:

Q. Now in addition to your rejoinder, there's also rejoinders by Robert Fay
and by Dr. Rao; right?

A. Correct.

Q. And these were the principal authors, and then they get basically a
chance to comment on the comments; right?

A. Exactly.

Q. Okay. All right. Can you turn to Robert Fay's rejoinder which is at page
517.
A. Yes. 

Q. If we could have that on the overhead. 

Now if you could turn the — this is on the overhead, and it's "Robert E. 
Fay" on the left and the title of this is "Rejoinder;" correct? 

A. Correct. 

Q. Now if you could turn the page, page 518. 
A. Yes.

Q. I want to direct your attention to the second column, the first full
paragraph. Do you see that?

A. Yes.

"The debate..."?

Q. Yes.

It states that "The debate over the suitability of MI" -- MI
refers to multiple imputation; right?

A. Correct.

Q. It says, "The debate over the suitability of MI to complex samples must,
I consider, be one of the most confusing in our literature. It is not hard
to find applications of MI to complex samples that make general but vague claims
to suggest that the application is proper, but the evidence has generally been
[illegible]" Do you see that?

A. I see that.



Q. Let me direct your attention now to page 513. This is the comment by John
Eltinge; right?

A. Yes.

Q. He's an associate professor at the department of statistics at Texas
A&M; is that right?

A. That's correct.

Q. I just want to direct your attention to the second full paragraph and
specifically -- specifically the title of that paragraph. Do you see that?

A. No.

Q. Number [illegible].

A. Yes.

Q. "MULTIPLE IMPUTATION CONTROVERSY."

A. Correct.

Q. So one of your colleagues calls this multiple imputation theory of yours
a controversy here; right?

A. That's what he calls it, yes. It's a new idea, relatively new, and often
those things are controversial.



Q. You authored that article for the first time in 1976; right?

A. I authored that -- I made a proposal, yes, for the first time.

Q. And that was 20 years before Mr. -- or Dr. Eltinge's comment; right?

A. Yes. But the point is that the standard sample survey world goes back
to an article written by Jerzy Neyman in 1934, so with respect to 1934, 1976 is
recent.

Q. In the -- in the history of statistics, 20 years is not a long time; right?

A. In the history of trying to do data analysis, that's --

Q. I'm sorry?

A. In the -- in the history of trying to do statistical things, 20 years is
not a terribly long time, that's correct. And also because it takes advantage of
modern computing to do a lot of the -- a lot of the work.

Q. Well that's an interesting point. Would you agree, sir, that appropriate
software for multiple imputation work is still badly needed?

A. I think I wrote that when I wrote this article, and maybe -- maybe in
another article as well, but the situation is improving.

Q. Yeah.

A. The fact that NHANES did it and could do it in a -- in a major government
survey indicates that. The fact that survey of consumer finances did it years
ago indicates that. The fact that the Department of Transportation did it
indicates that. Situation is improving.

Q. Sir, you have made the statement that appropriate software for multiple
imputation is badly needed; right?

A. When I wrote that, that is especially true, and it's less true now.

Q. And you wrote that only about two years ago; right?

A. I wrote that -- I mean it was published about two years ago.

Q. Two years ago. Okay.

A. But the typical delay from the time an article is submitted to the time
it's published is probably two years. I think maybe at the bottom here it says
-- I don't -- sometimes they show when an article was submitted, and we

impu 





d it if yo 
me ask yo 
eneral in, 
revious rep 
ay. 

Now those stu 

not neces 
t is IpS^& rre 

Now 

if f s' mode 
1 in a gen 
done iSpEhe current 
the plS ||jft :if f s 1 mode 
Q-^Arvd you tea 11 
damage^ijgggistimate of 

Q.pjo|py. Sir, I 

* re the cr 

right?' 






the 



to find when I actually wrote the words. 

: Assume that the epidemiological studies ci* 
say the 1989 report, which is basically a summary zz 
do not use multiple imputation. Assume that. 

re not unreliable solely because of that fact; 

, that’s correct, 
aht? 

rself done any imputations in the data sets used in 
rect? 

ort of sense, I — I participated in the imputations 
, but certainly not the data sets that were used in 
se, that’s correct, I have not. 

t know what effect doing imputations would have or. th* 
illion in the plaintiffs' model; correct? 

Q. Okay. Sir, I want to ask you about propensity scoring.

You're the creator of propensity scoring along with Paul Rosenbaum;
right?

A. That's correct.

Q. And was that creation actually published in an article in approximately
1977?

A. I believe it was 1983.

Q. Sorry.

Do you recall where that was published?

A. I believe it was in Biometrika.

Q. And so would you call yourself, then, the co-creator of propensity
scoring?

A. I think that's accurate.

Q. Now if the Surgeon General relied on epidemiological studies that failed
to use propensity scoring, is it your opinion that all those studies are invalid
and worthless?

A. No, not necessarily.

Q. Do you know of any studies on smoking and disease that used propensity
scoring?

A. Smoking and disease. I don't believe so.

Q. Other than this case, have you attempted to calculate propensity scores
for any study of smoking and disease?



SCO 


in 



Q. Now does the Agency for Health Care Policy and Research use propensity
scoring in analyzing the NMES data?

A. I don't know.

Q. Let me direct your attention to your deposition, which is in the folder
in front of you, specifically your deposition of October 7, 1997.

A. Okay. Yes, I have it.

Q. Do you recall giving your deposition on October 7, 1997 in this case?

A. I recall giving a deposition. I assume that that date's right.




Q. Do you recall you were sworn to tell the truth?

A. Absolutely.

Q. And you did tell the truth; right?

A. Absolutely.

Q. Let me direct your attention to page 363.

I direct your attention to line two.

"Now, does NMES use propensity scoring?"

"I certainly don't think so."

That was accurate at the time you gave it; right?

A. My memory of this, and this may not be correct, that this was in the
context when you were asking about the creation of the database and imputation
in your question -- I should have it read back -- was -- had to do with
does it use it for the analysis of data, and I don't know what the agency does
when they analyze the data after they produce the public-use file. I
believe this context -- was in the -- in the context of imputation, and I
think that's -- that's what my memory is.

Q. Sir, do you want me to read the question again? It says, "Now, does NMES
use propensity scoring?" And your answer was "I certainly don't think so." Was
that accurate or not?

A. I think --

Q. Have I read that accurately?

A. I believe you read that accurately, but I'm saying there's a context,
which I believe -- and my memory in this whole question, this was about using it
for imputation, I believe. Maybe that's not correct. And what I was
distinguishing in your question as you asked here was does -- does the agency
use it for their analysis of that data set, and I -- and I -- I don't know about
what the agency does currently in analyzing their own data set. I certainly
agree now that -- that the -- I -- I really doubt that they use it for
imputation or for handling missing data. And I think that was the context that
was being developed here.

Q. Do you know if the agency has ever used propensity scoring in analyzing 
its data? 
A. I don't know.

Q. Now you know what the current population survey is; right?

A. Yes.

Q. It's a large U.S. government survey; right?

A. Yes.

Q. Lots of data; right?

A. Yes.

Q. Do you know if the government -- government analysts for the current
population survey use propensity scoring in analyzing data?

A. Again I just want to clarify the question. Is this with respect to
missing data problems in the database, or once the database has been created, do
[illegible] people use propensity scoring methods -- methods to analyze the data?

Q. If they use it at all, sir.

A. This is whom? That is the agency -- the Census Bureau?

Q. Talking about the current population survey, sir.

I know you --

A. I know which survey you're talking about. I want to know who "they" are in
your question.

Q. The authors of the current population survey,

A. This is the staff of the -- of the -- of the Census Bureau,
I don't know whether they do. I would doubt it.

Q. All right. Now do you know whether the people in Minnesota responsible
for the Behavioral Risk Factor Survey use propensity scores?

A. I don't know, I would doubt it.

Q. Do you know of government surveys that use propensity scores?

A. That -- I --

I understand the question, because it -- it -- a survey doesn't use
propensity scores. Is it the people who develop the survey at the agency, do
they use it? Is that a fair recast?

Q. That's a fair recast.

A. Okay. Could you ask the question again with that -- I --
I want to make sure I'm answering the question correctly so I don't run into
this problem where it's in one text and it's in something else.

Q. Simple -- simple question, sir.

Do you know of any government surveys where the agency itself uses
propensity scoring?

A. Well I believe people at various agencies use propensity scoring methods
on databases that they've put together. One example would be GAO. The
General Accounting Office used propensity scoring methods to look at that
[illegible]. Another example is the NIH, where a group of -- of -- of doctors used
propensity scoring methods to analyze data they put together on the effect of
[illegible] deficient baby formula maybe eight years ago. There -- there are --
there are applications like that done by people at federal agencies who want to
address questions of the effect of exposures. That's different from using the
methods to -- to create imputations or to fill in missing data. I'm just trying
to be clear.

Q. All right. Now with respect to using propensity scoring to fill in 
missing data, do you know of agencies that use propensity scoring for that 
purpose? 

A. Well I would not regard it as generally very appropriate in most cases, 
but there have been applications. Internal Revenue Service. 

Q. Sir, have you found articles in the reported literature that frustrate 
you because the author uses regression analysis in comparing two groups without 
doing propensity scores? 

A. Well yes, in -- in the sense that -- and this is from my last deposition
at the very end -- that you've devoted your professional life to certain topics
to try to improve science, it can be somewhat frustrating to see a relatively
slower movement to using better methods, whether they're used by -- created by
me or not, than you'd like to see happen.

But specifically you've seen articles that were published in, you know. 



le journals! 
nd you fou 
, not — n 
ve seen th| 
re there ar 
hat's frust 
le places b 
and not he 
mean you'ref'fru 
epted at t 
t)h^ no ' s J19 t a 
t be #piPpie 
whiJP^ou 
I *3f*ftH|the 
have done 
ion. 

. Now you as 



e regression analysis was used without propensity 
s to be frustrating; right? 
the way you characterize it that way. I’m saying that I've 
js^that are written in journals, in fact you just quoted 
“tements that I regard as being incorrect, and that’s 
g to see things that are -- that are quoted in 
pectable people that — that I regard as - - as 
to the future of the field of statistics, 
itrated that your theory of propensity scoring isn't 
int; right? 

. And it’s been — it's been very widely accepted. I -- 
the progress that we're making. I'm just saying that 
at applications that are making mistakes and you just 
d do it more correctly, not using propensity scores, 
ter, because they could have gotten a better answer rc 

ditor of journals have seen articles or commented or. 
was done without propensity scoring. 





$ 


/here regrej 

A. Absolutely.

Q. And those articles --

A. I've accepted articles like that as an editor.

Q. And published; right?

A. Yes, because I thought they were important, whether or not they -- they
used things that I happen to think are -- are better.

Q. Now I take it that you think that the Zeger model is unacceptable and
invalid; right?

A. That's correct.

Q. Okay. Isn't it true that you don't know whether any statistical analysis
can be used to compare the smokers and the non-smokers in NMES in a way that
would yield valid results?

A. I haven't done that investigation completely, but I -- I believe there
are methods.

Q. Well don't you believe that it might be impossible to compare smokers and
non-smokers in NMES in a way that would yield reliable and valid results?

A. It might be, but from the examinations that I've -- I have done, I don't
believe it's impossible.

Q. You would believe that it's impossible?

A. No, I do not believe it's impossible. I think it has to be done a lot
more carefully than it has been done.

Q. Well you have not taken that next step to actually do such an adjustment,

That’s correct. 

And I — . 

t true that you believe that Zeger's regressions could grossly 
ect or grossly undercorrect for bias in the subgroup of males 19 to 2-.1 
es, there — the 

Zeger et al report’s regressions could grossly overcorrect or grossly 
rect. You just don't know, 
ut you just don’t know; right? 
e don ’ t kno'Ks y , 
hat’s just wheiitii/ou said; right? 
es. 

u don’t kn<| 

don’t know. "We” "we" collectively in the sense if 
hey produce and we -- we don’t know what they mean, 

— completely biased one way or another, 
e today, you can't tell us whether the Zeger model 
rossly undercorrects; right? 



an 



think I sa 
at the ans 
|ld be compl 
id as you s 
grossly overcorrect 
s corre 
do 

/yP%’ve hi 
medffisb&mliter 
wrong. I ju 


A^Sphat’ 



k 
o 
re. 

I 

also has 
11 accept 
you know 
re only fiv 
have no id 
the article tha* 

ccmpl efeats at that poij 
O g you know 

S ou aware w 
shed in 199| 
.1 — I ha' 



tner 




at Medline is? 

I believe it's a way to access publications in the 
elieve that's right. On- line access. But I could be 
rd of it. 

ific literature as well; right? 
if that’s correct. I -- I really don’t knew, 
edline search of articles in 1997 would reveal 
cles published on multiple imputation? 

— 1 have about 20 or — or so applications in 
asked me to look at, so I guess that it's not 


that 




ver- 


jhere was only one — or strike that. 

a Medline search would reveal that only one article 
jpropensity scores? 

idea whether that's — whether that's true or not. If 
you represent to me that it's true, I have no reason to -- to doubt it. But
again, it just suggests to me that it's not very complete.

Q. Are you familiar with the Journal of the American Statistical Association
Current Index?

A. It's a Current Index, not really a journal. A publication, yes.

Q. And that's basically a listing of articles --

A. Correct.

Q. -- regarding the -- regarding the subject matter of statistics; right?

A. Correct.

Q. And do you know whether a search of the current index would reveal that
in 1996 there were two articles published about propensity scores?

A. I will believe you if you represent that to me. I have no reason to doubt
it. In the statistics literature, that's correct, I -- I'll believe that.

Q. Would you or -- 

Are you aware that a search of the current index would reveal that there 
were over 290 articles published on regression in 19 — in 1996? 

A. It would not surprise me at all. 

Q. Are you aware that a search of the current index would reveal that there
were 14 articles published about multiple imputation in 1996?

A. I didn't know it was that high.

Q. Surprising; isn't it?

A. I don't know. I said I didn't realize it was that -- that -- that high,
that's all. I didn't know that. I mean multiple imputation is very young.
Regression has been around since the 18th century.

™ir. Wecker, IK^ ant^ — "Dr. Wecker," I'm sorry. 

Dr. Rubin, I want to show you the exhibit that you prepared regarding the
various databases. I'm not sure exactly where it is.
f re l WPfa 1 s missing. rail's part of the display. 


plii^ubin, I wan. 
var ious d atabases. II 
s missing. 
Q. W's missing, 
have to m 
^^fc QURT: The w] 
4 KR" HAMLIN : Yes, I 
(Laughter.) 
TfetHTTIjrg^ ■ You! 
BY MR ^A WLIN: 

Q^^^ .1 rit ffTrj Dr! 

if do flVff. i 


!Tj|ly impute it. 

^jPthing should be in blue there. 
isM we go. 


or, thank you. 


QP”?%1 ri^^y Dri^^Sin, this is going to be a little awkward, but let's sei 
gflBBBBjfe st because size of the exhibit. 

A acLa f you can't^^^ portion of it, let me know; perhaps we'll have you 
come dawn — Ls&s. 


me dawn — 

AQ#ay. igfflj 

Qm-t to talk ab p'£.r t. 

P-.Z^kzr .'s fair, r ^\ 

Q. I want to talk about the NMES database with supplement. All right?

MR. BIERSTEKER: Your Honor, I just wanted to advise the witness if he can't
see ieWlt’s at tab JrSrTre his book. 

THE WITNESS: Oh.

THE COURT: Okay.

MR. BIERSTEKER: Thank you.

THE COURT: Do you want to come up here, counsel?

MR. BIERSTEKER: I'm fine.

THE WITNESS: Do I have the wrong book?

THE COURT: Wrong book I think. 

THE WITNESS: Is this the one? 

THE COURT: I think it would be right here, sir. 

THE WITNESS: Oh, thank you. Thank you, Your Honor. 

A. Yes. 

Q. Okay. So I want to ask you about the first database that's listed here, 
t Copr. © West 1998 No Claim to Orig. U.S. Govt. Works 
NMES with supplement.

A. Yes. 

Q. As I understand the line, the blue boxes represent missing information; 
is that right? 

*12 A. That's correct. 

Q. So let's talk about that first box, physician diagnosis of medical
condition. See that?

A. Yes, I do.

Q. Now you agree that NMES has evidence of medical conditions; correct?

A. Yes, I believe it's self-report.

Q. Right. If we go to diagnosis based on self-report, that is one source of
medical conditions; right?

A. That's one source of information about medical conditions, that's correct.

Q. Okay. And there's another source, the self-report of physician
communication on medical conditions; right?

A. That's correct.

Q. And that comes from the NMES questionnaire, specifically the question
which asked whether someone was told by a doctor that they had --

Q. -- various diseases; correct?

A. Correct.

Q. Have you seen the NMES questionnaire?

A. I may have, but I don't have a specific recollection.

Q. But you recall that there was a question where a person was asked whether
a doctor had said that they had cancer or some other --

Q. -- disease such as stroke; right?

A. Yes, right.

Q. Okay. So there really are two sources, additional sources of medical
conditions; right?




A # C orrect. The r|ss 0 L& 5 'self-report medical conditions, correct. 

Ql^S#kay. So it i,sn't|that NMES is lacking information on medical conGi^Lcns 


right! 

A 


? 

provit 

ol 


s that's right. It's not lacking all information on 
flacking physician-reported information on medical 


A psssBi ♦ s -- it's g--* 

f as it's labeScff that's right. It's not lacking all information on 
onditions, lacking physician-reported information on medical 

1 right. 

's not necessary — 

Q. If you go to the next box here, which is labeled amount reimbursed to
provider, it indicates that that information is missing; right?

A. That's correct.

Q. And the provider is a doctor or a hospital; right?

A. That's correct. 

Q. And what you're saying is that the amount, how much the doctor was paid, 
that's missing. 

A. That's correct. 

Q. Okay. But you agree that in fact NMES has information on how much 
providers received; right? 

A. Again, it's not the same kind of information. But that's correct, it has 
some information. 

Q. Yeah. I mean if you go to the box marked total medical expenditure from
provider or household survey, there's information in there about how much the 
provider received; right? 

e ah, that’s — you know, that's self — yeah, it's — 
alf missing. That's correct, 
t that’s — 

But there's something there. 
Q. There is definitely information there; right?

A. That's right. It's not the same as the -- as record information.

the title is a little difficult to understand. You say 
from provider or household survey. But what you near. 
id the provider or the household — or in the household 
le medical expenditures were; right? 



11, you kno 
ical expend 
MES actuall 
the — w 
s correc 
cause there 
. Yes. 

mean we tal 
That's correc 
kay. So that 
not the 
re 

as wh&uiRS 
is sej^ ^S or 



separate provider survey done by NMES; right? 
that earlier; right? 


we =0X6 


one form o 
s. We have 
AJ|1 right. An 
indicates t 
t's what 





mode 



)OU 


mation is in there, 
nformation. I'm — I — I've done studies of — of 
eport expenditures, self-report conditions, are net 
ecords of or -- or doctors have records of. Self- 
it's different. 

information in here about what providers were paid; 


her we got it in there. 

orm of information, that's correct. 

the third box is self-report stay at nursing home,
at information is missing; right? 
icates, yes. 

NMES in their nursing home 


Q.^StinL right. Nov|g&§g*g|plaintiffs didn't use 

-- ht? fnri 

t's correc 

intiffs used NHANES, a different data set; right? 
t's correct. 

for purposes of the model, in effect all of the boxes, all of the 
MES in fact contain information because there is information about 
diagnosis and medical condition, and information about what providers received;
right?

A. There's not physician diagnosis of medical condition, there's not -- it's
not the same kind of information. It's — it's self-report. That's available, 
that's correct. 

Q. Yeah. I mean so there is expenditure information, there is medical 
information, it's just a different type of information; right? 


A. That's correct. 
Q. Right. Okay. 


It's a different type of information. 
Okay. Now let's go down to the billing and enrollment data.

A. Okay.

Q. All right? This is the Medicaid claims data you're referring to; right?
A. Correct.

Q. Okay. That's the entire line across; right?

A. Yes.

Q. Now there, there is information about physician diagnosis of medical
condition; right?

A. Correct.

Q. And in the claims information there is information about the amount
reimbursed to the provider; right?

A. That's correct.

$2%lsp it makes oy the information that's lacking in NMES. 


it makes lip p! oy 
? 

. . rVi 


Q 

inf on 
reimb 


^1 f act, ra!S8S t ^ ie physician diagnosis and the amount reimbursed ar^ 
the Minr r^og Medicaid population; aren’t they? 

A^That questlor ^^j 'a not sure I understood the previous question. 

Q. I'm asking you this question, and that is: Isn't it true that the
information in the physician diagnosis of medical condition and the amount
reimbursed to provider in the Medicaid data is specific to Minnesota?

A. That's correct.

Q. Okay. There's also a line for Blue Cross Blue Shield; right?


formation about the amount reimbursed to provider, 
ta; right? 


A. Correct.

Q. And this again refers to the claims data; right?

Q. And again, there's information about physician diagnosis of medical
condition in the Blue Cross data; right?

A. Correct.

Q. And that's specific to Minnesota; correct?

A. Correct.

Q. And there's also information about the amount reimbursed to provider.
That's specific to Minnesota; right?

A. Correct.

Q. Let me ask you this: If you go to the billing and enrollment
Medicaid data, you have N as unknown. In fact all you have to do is look at the
claims data and you can find out the number of claims; can't you?

A. I presume -- I presume so. This was --

this analysis was based on the information that was provided to me and I
provided to my assistant through plaintiffs delivering it to attorneys, and --
and I believe from that information it could not be determined.

Q. So you haven’t actually gone in and looked at the claims data and 
determined the number of claims in Minnesota — 

A. That’s correct. 

Q. — on the Medicaid data; correct? 

A. That’s correct. 

Q. And you haven't done that for the Blue Cross data either; correct? 

A. That's correct. 

Q. So that's why you put unknown down here; right? 

A. It was unknown from the information provided to us through the attorneys 
that came from plaintiffs. 

Q. Well, do you know whether or not the claims data for the Medicaid program
was produced to the defendants in this case?

A. Well some -- some was.

Q. Do you know if all of it was?
in fact -- 
't know. 

Q. Do you know if all the Blue Cross data was produced to the defendants in
this case?

A. No, I do not.

t you only deceived some; right? 

me" impliefrijJ&aow that I only received some and not all. I do not 
ay have rec a^ffi al all, I may have received — received some. I do 


thi 


know. 

knowj 




r.oc 


it in any ca 
I 'did not tak 
assistant do the com] 
oCjife&w let me as 
to go %e r to J^^ cat 


o ’joy e 

Afz^ s - ^.-■ 

j gy^rro w it Tnaxca 
A. In th#^^e C 
think — oo 
ere’s no — 
get my colors 
ay. 

ght. B1 




Qj&A\ 1 ric 
A.^^fght. 



ay. 

's awash in 
u're indica 
jtabase; is 
at's what t 



didn't review it; right? 

databases myself and look at them personally. I had ar. 
for me, that's correct. . 

about the billing and enrollment data line, and I want 
11 right? 

that there is information about education; correct? 
lue Shield, that's what it indicates. Is that right? 
rry. It indicates there's no information. 

ed up here. 

ans there isn’t any information; right? 


$ 


of blue. 

:here that there's no education data in the Medicaid 
right? 

idicates, that's correct. 

ly. Well are you aware that the plaintiffs in this case actually took 
information from the client index file, which is the daily ledger kept 
iicaid program by the state agency, and put that in their model? 
the — in the billing and enrollment Medicaid data, they put that in 
is that the question? 

|rou want the question again? 

A. Yes, I guess I better have it again. I’m sorry. 

Q. Sir, are you aware that plaintiffs took from the client index file 
education information — and the client index file, just for your information, 
is the ledger kept by the state of Minnesota regarding the Medicaid program. I'm 
just asking you if you're aware of that fact. 

*15 A. I don't think so. 
Q. Are you aware that they took that information and put it into the
plaintiffs' model? I'm asking you if you're aware of that fact.

A. No, I don't think so. I -- I don't think I'm aware of that.

Q. Okay.

A. I was trying to figure out where they would have put it in the model.

Q. Now as I understand it, you didn't personally look at the computer
information that the plaintiffs provided, so you personally wouldn't know
whether it was in there; right?

A. That's correct.

Q. Okay. Let me ask you this: With respect to the information in the
Behavioral Risk Factor Survey, you indicate that there is information for race;
is that right?

A. That's what it indicates, yes.

the plaintiffs did use race information in their model; 




pub 



iat's what 
kay. And in 


raci st? _ 

vat's corre 
cay. And ar 
id people, 
is. 

*Q. Sb it's not 
about^whether someo 
moderg^jncome; ri 

A ^lYe l, I am., aw 

WrS® are yurr; awa. 
used multiplePiSlputa| 
I heard 
you aware' 
A^Yes, I heard 
Qlyyp that troubl 
Ap N|>t necessari 
would you 
use o ammfc r both of t 
study Lfeglyalid? 


pM H Wfes, 




aware, sir, that NMES includes information net only or. 
so on low-income people? 

hether someone's on public aid, they also ask questions 
a high income or a low income or a moderate -- 

that. 

of Dr. Wecker’s testimony yesterday that he has never 


is testimony that he's never used propensity scores? 
too. 

to you, sir? 

with his testimony that simply because a study didn't 
techniques, that in and of itself would not make a 


A 

Q 

exper 

A 

I cer 

Q 



at's correct 
ght? 

I . 4 onwmrnrmr 

ve said the r sarne thing, 
Q. Okay. Are you aware that Dr. Brian McCall, one of the defendants'
experts, has also said that he doesn't use propensity scores?

A. I think I read that, but I can't recall the specific reference. But I --
I certainly would not doubt that representation.

Q. Are you aware that Dr. McCall has also said that he's never used multiple
imputation?

A. I don’t remember specifically, but I wouldn't doubt that either. I — I 
accept the representation. 

Q. And do you find that troubling, sir? 

A. Not necessarily. 

Q. Now as we've seen from some of the literature, statisticians have 
complained about the difficulty and the complexity of some of your statistical 
theories. Let me ask you this: You told us about William Cochran, who was your
advisor at Harvard and a very distinguished statistician; correct? 

A. Correct. 

Q. In fact he was probably one of the leading statisticians of this century; 
right? 

A. I would agree with that.

Q. I mean you're proud to call him your mentor; right?

A. Absolutely.

Q. Let me direct your attention to Trial Exhibit 8910, which was designated
by defendants. Sorry, AZ -- is it AZ? Sorry. AZ008910.

00 -- 

was designated by.the defendants, so it would be in their book. 

HE COURT: I^Nhas, tab numbers on it. 

provide you a tab number in a moment. It’s velobound. 


by 



MR. 


[IERSTEKER 
prof essor 

CTNESS: Parc 
[ERSTEKER: j 
fITNESS: Oh, ^o 
Iamlin: Yeah 
THE WITNESS: I a 
MR. HAMLIN: Ten. 
TKg|ttITNESS: 891 
MI ^LB KRS TEKER : 

2iO AMLI lIir u 

MR. HAMLfJ^Nou 
tlTNESS: Yes 
^MLIN: All 
1LIN: 

you turn 

A s. 

Q 1 right. Thf 

.ions to the 
*s correc 
d this was 




BY MR 




ne of the velobound pieces. 

ze. Eight nine — 

bably? 

correct. 
e -- 

e that. 



e 37. 

an article by you entitled "William G. Cochran’s 
n, Analysis, and Evaluation of Observational Studies." 

hed while you were at the University of Chicago; 

at' s correctJTHffell I wrote it while I was at the University of Chicago, 
re when it was published, 
where was it published? 

is was a book that was in honor of him and his contributions that was 
two of his students who asked other of his students to contribute to 

if I told you that it was copyrighted 1984, would that — 

A. I’d believe that. 

Q. — be right? 

A. Yeah, I think that's right. I'd believe that. 

Q. All right. 

MR. HAMLIN: Your Honor, we'd move for the admission of AZ8910. 

MR. BIERSTEKER: No objection. Your Honor. 
THE COURT: Court will receive AZ8910.

BY MR. HAMLIN:

Q. Sir, let me direct your attention --
Well let's -- let's put the title page on the overhead first. So the title
is "William G. Cochran's Contributions to the Design, Analysis, and Evaluation
of Observational Studies," and you were one of his students and you wrote this
article in honor of him; right?

A. Correct.

Q. Okay. Let me direct your attention to page 39.

A. Okay. I'm there.

Q. I want to direct your attention to the first full paragraph. It states,
"The ultimate objective of Cochran's statistical research on observational

;as to provioa kth e investigator with reliable statistical tools and sag* 
their use.yi^ore than one occasion Bill told me that it was better 
^he applied researcher a reliable tool that is well understood and will 

ore powerful, but complex and potentially misur.dersroc: 
misapplied. As expected given such an orientation, 
no a priori favorite methods, good practice being r.ori 
cal orientation.” 
t you, sir? 
till believe it. 

. No further questions. 

Honor, there's one article I would probably like to uss 
: ly happy to move it now, though, and in the presence oz 
preference. 

ou just use the Elmo there, 
e record.) 

IPre you finished with that? 

Honor. 

4*fe lay it down or something, 
on't we just tip it over. 



:operly tha 
can be ea 
;eems to hav 1 
than philo 
'You wrote that; 
A^I wrote it an 
M&SlSWMLIN: 

*ERS 
f or. 

It 1 

THE C OURf 
jP||9$8t|D i s cu s s i on 
Tfi^toURT: Couns 
M m JAMLIN: Yes, 
TfTOCOURT: Why d 
MEf? &IERSTEKER: 





bother me, but it does block other people, 
aid down.) 


TfflTOOURT: It doesn' 

(EpP&L collapsed 
BY MR.p8$ERSTEKER: 

Q. Professor, is multiple imputation the only valid way to impute missing

A. It's not the only valid way to -- to handle missing data. There are other
handling missing data that — that can — can be valid. 

>u happen to think that multiple imputation, however, is a good 



& 


xhink, in fact for precisely this — this reason that was just quoted 
*ran article, that is the technique to use when dealing with item non- 
scause it — although it takes effort to create the imputations, it's 
transparent to users and it is very automatic and very reliable. 

Q. Did plaintiffs use any valid method to impute the data that were missing 
in their data sets? 

A. No, they did not. 

Q. Is propensity scoring the only valid way to adjust for collection of 
background factors? 

A. No, it's not. But again, if you follow this advice of Cochran, it makes
comparisons transparent by revealing what's going on, whereas complicated 
regression models and probit models and maximum likelihood are not transparent 
and can deceive people. 

Q. In some circumstances regression analyses can reliably adjust for 
background factors; can't they? 

A. They can. There are circumstances where they can.

Q. Do those circumstances exist in the data that the plaintiffs used?

A. No, they do not.

MR. HAMLIN: Objection -- wait. Objection, leading.

THE COURT: Well it is leading, but I'll allow it.

MR. BIERSTEKER: Thank you, Your Honor.

A. No, they do not.

Q. Professor, if you could turn to Trial Exhibit ASP000041, which is at tab
r book. 

A. I'm there.

Q. And this was an article that was published in Johns Hopkins University's
American Journal of Epidemiology; is that right?
A. Yes. That's that article.

Q. And it is written by Sander Greenland and William Finkel; is that right?

a sm' s 

Q. Are Sander Greenland -- Greenland and William Finkel reliable authorities
in the field of epidemiology?

A. I know Sander quite well, and he's very highly respected. I don't know
William Finkel as well, so I -- I -- but I assume he is.

Q. Does this article form part of the basis of your opinions in this case?
A. Yes, it does.

MR. BIERSTEKER: I'd move the admission, Your Honor, of ASP000041 as a
learned treatise.

MR. HAMLIN: No objection, Your Honor.

THE COURT: Court will receive ASP000041.

BY MR. BIERSTEKER:

Q. If we could focus please on the abstract of the article, I'd like to read
a few sentences from that first. Starting with the very first sentence, doctor,
it says, "Epidemiologic studies often encounter missing covariate values." Do
you see that?

A. Yes.

Q. And then it goes on, sentence a little further down, "The method...." See
it?

A. Yes.

Q. Could you -- could you read that sentence, please.

A. "The method based on missing-data indicates -- indicators can exhibit
severe bias even when their data are missing completely at random, and
regression (conditional mean) imputation can be inordinately sensitive to model
misspecification."

Q. Professor, was — did — 

Did plaintiffs use missing data indicators to impute missing values — 

A. Yes, they did. 

Q. — in the data set? 

Then it goes on to say — and if you skip one sentence, move to the next 
one, "More sophisticated...." Would you read that one, sir.

A. "More sophisticated methods, such as maximum likelihood, multiple
imputation, and weighted estimating equations, have been given extensive
attention in the statistics literature. While these methods are superior to
simple methods, they are not commonly used in epidemiology, no doubt due to
their complexity and lack of packaged software to apply these methods."

Q. All right. And then the last sentence starting "In general...," it reads,
"In general, the authors recommend that epidemiologists avoid using the
missing-indicator method" -- that's the method that the plaintiffs used in some
instances -- "and use more sophisticated methods" --

THE COURT: Counsel.

MR. BIERSTEKER: I'm sorry. I shouldn't have done it. I apologize.


THE COURT: Yes.

MR. BIERSTEKER: I apologize.

THE COURT: Just ask the question.

Q. I'll just read the last sentence. "In general, the authors recommend that
epidemiologists avoid using the missing-indicator method and use more
sophisticated methods whenever a large proportion of data is missing."
Professor, is that consistent or inconsistent with your opinion?

A. It's consistent with my opinion.

rn, also, then, to page 1263. 



Q. Second full paragraph under "Discussion" reads -- the first sentence,
quote, "Multiple imputation based on a multivariate normal model for the
incomplete regressors is thoroughly described in Rubin's textbook, and is not
difficult to program."

Professor Rubin, is it difficult to program multiple imputation methods?

A. It's -- it's not difficult for people who are modestly
sophisticated at doing the kind of computing that's involved in running the
kind of models that we're running.

Q. Professor, I believe you testified that you were one of the creators or
the creator of multiple imputation in the 1970s and that you co-created
propensity scores in the 1980s. We discussed earlier in your direct testimony
your receipt of the Wilks award from the American Statistical Association. When
did you receive that award?

A. I believe it was 1995, I believe.

Q. And were your achievements that formed the basis of your receiving that
award from the American Statistical Association specifically cited at the time
that you received it?

*19 A. Yes, they were. 

Q. And what achievements were included in the list of your achievements for 
which you got that award? 

A. Included were methods for handling missing data, such as multiple 
imputation, and methods for obtaining causal inferences and observational
studies, including propensity scoring methods. 

Q. You’ve received the Parzen Prize for Statistical Innovation; haven't you? 
A. Yes, I have. 

Q. When you received that award, was there a specific reference to your work 
that formed the basis for your receiving it? 
A. Yes, there was.

MR. HAMLIN: Objection, Your Honor. I think this is beyond the scope of the
cross.

THE COURT: Well I guess I'll allow this, but we're starting to get beyond

MR. BIERSTEKER: I'll ask one further question, Your Honor.

THE COURT: All right.

BY MR. BIERSTEKER:

;Q. were — werePyour -- 

he award ot^m^ensity scores and missing data and causal inference 
cited wiffen you rece|^|^^:hat award? 

A. Yes, they -- yes, they were.

MR. BIERSTEKER: Nothing further, Your Honor.

MR. HAMLIN: I have nothing further.

THE COURT: You may step down. We'll take a short recess.
(Recess taken.)

THE CLERK: All rise. Court is again in session.

(Jury enters the courtroom.)

THE CLERK: Please be seated.

MR. GARNICK: Your Honor, the defendants call as their last witness Dr. Brian
McCall.

THE CLERK: Dr. McCall, if you could please stand, raise your right hand.

(Witness sworn.)

THE CLERK: Please state your name and spell your last name.
THE WITNESS: Brian Patrick McCall, M-c-capital-C-a-l-l.
THE CLERK: Thank you. Please have a seat.

BRIAN P. McCALL, called as a witness, being first duly sworn, was examined
and testified as follows:
BY MR. GARNICK:

Q. Good afternoon, Dr. McCall.

A. Good afternoon.

Q. Where do you live?

A. Pardon?

Q. Where do you live?

A. [DELETED]

Q. [DELETED]

A.pMMp 
Q. What is your profession? 

A. I teach over in the Carlson School of Management in the Human Resource 
and Industrial Relations Department at the University of Minnesota. 

Q. And could you briefly tell the jury, briefly describe for the jury your 
educational background. 

A. Well I'll start since after high school. In, I don't know, 1981, I think, 

n, we may often wish to apply the powerful
techniques for comparing and combining
results known as meta-analytic procedures
(e.g., Cooper, 1989; Glass, McGaw, &
Smith, 1981; Hedges & Olkin, 1985;
Rosenthal, 1987; 1991). Combining the
results is straightforward; we simply average
the obtained πs with an appropriate
weighting (Rosenthal, 1991). Comparing
the results can be by means of either diffuse
or focused tests.




{.7o-.sen* (■o-.scsi* cos^-.s^ 1 
.oo« . 02 is ,0162 


= 17.77, P-.00014 


Therefore, we can conclude that the results 
of these three (arbitrarily selected) studies 
differ significant!y-among themselves, 


Focused Tests 


Far more informative than the diffuse 


Diffuse Tests 



entity. These tests assess 
gnifkanct of the 
obtained its using the 
mAdf, where m is the 
lent studies: 

-n2 

( 8 ) 


(9> 


(IP) 


lest. Rom a recent 
of 28 guizfald 
R ose n thal 1968; 
Honorton, H965r Hyman, 198S) Che results 
of ft> first jbhree w ed studies are shown in 
Tables. RfiE^giftudjr wa And the number 
of trials ( Mg^^BS imher of eo rree t guesses 
(hits), P (hKs/fnau), and l the number of 
stimuli fro teSitfatth the correct one seas to 
be aelectomAleostiowrn an K ISE^j*. and 
« as obtained from equations (Z), (4), and 
(10) r e s p ec tively. From equation (9) wo find 

it as follows: 


- {17J.«X-W)*<4A.4X.tt)-*-(tft.3)(.ae) 

* * — 111 ... 

171.4 ♦ 44.4 ♦ 41.5 


*.565. 


To test the heterogeneity of these fives 
studies we use equation (8) to find: * 


or omnibus tests an the focused tests, or 
contrasts, that address quite specific 
r esear ch que st io n s (Rosenthal 4c Rosnow, 
1985; 1991). Any hypothesis about what 
features of a study might be significantly 
related to the obtained effect size, K , can 
be tested by the following t statistic 

Zm ~ r *y»- T <*« 

nse^n ... - •< 

where the ft} are contrast weights, which 
sum to zero (eg, for four studies ordered 
by mean age of subjects, *,*-3, 

Xjti+1, V,=+3 represent a Unaar trend in 
age, and ft-*-l, Xj«+1, ft$*+J, l^e-l repre¬ 
sent a quadratic bund in age). 

Ex*mj>U of • facetted test. Suppose we 
had hypothesised fiat In gaiafeld experi¬ 
ments, subjects would pe rf o rm rslattvtly 
better when the task was lees ‘complicated’, 
for example when K the number of altema- 
lives from winch one was to be selected, 
wasjsmaSlar nt)w than larger. Tebie 5 
shows flat five three studies emaciated with 
Is of 4, 6, and 5, The cor r e ct eanbaet 
weights or Is r epresenti ng the variable of 
complexity (defined by ft) can be u l iteli ia d 
by subtracting the mean value of ft (5 JO) 
from each of the three to. The resulting Is 
are-1, +1, and 0, respectively. Since t h e se X 
< cum «> coo they are proper c on t ra st 
weights. 

* 

Applying equation (It) wo And; 


1 
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TabU 5 

ftuullt of three gemftld experiments 


* J>=Hfot/trUk 

^ For pu >y w of computing tha ttuvdwd anot, P* of .00 and 1.00 m adjuttad by adding the 
eomctiai».5/(N+l) to .00 or subtracting it from 1.00. TCurefoni'is adputsd to .03, and n ia 
adjusted to .09. 



<-!)(■ 70> 4 
/(!)(-0056) + 


irtSSL 


-0.12,1 




.1676 


This result pr 
*• contrast with t 
tudies ^aS^nd 3 

on of thfeapuio 


a«- 1 

WpMftudiea 1 and 2 ongjg 
Wad fr on the other. Tha cl 
M ^comparison are +1. 
^#ield the following^ 


4(0)(.09) 

+ (0)(.0162) 
vs-tailed. 


little support far 
M, 41, and 0 for 
idvaly. Bxamlns- 
In Table 5 shows 
ation among the 


i band and study 
t weights for that 
nd -2 and these 


<-rl)<.70) +(+!)(.] 


mbi h I 

T5 



■8) 4 (—2) (.09) 
25)+(<)(. 0162) 


O *-£00041, ona-tsiled, 

ft * Since s^aajfHl), we Rod that the 
^Hfa sodatad with Ms contrast s 1532, which 
^®|ccounts far 1532/1T27 = 37 of the overall 
pp|| X 2 for heterogeneity. Moat of the vari¬ 
ation among tha three ns, therefore, is amo- 
dated with this contrast Since (hue throe 
studies ware se l ected arbitrarily, we should 
not of course, attach any scientific meaning 


Study 

M trials) 

Hits 

JP° 

k 

n 

(SEn )* 

tv 

1 

32 

14 

.44 

4 

.70 

.0056 

176.6 

2 

10 

3 - 

30. 

6 

M 

—0225 

44.4 

3 

14 

0 

.03* 

5 

.09 

.0162 

613 


fa these results, which served only es a 
numerical Phatratm Also, since tha con¬ 
trast weights 41, 41, and -2 were selected 
after examination of the results, the associ¬ 
ated p-value will be too liberal. Indicating 
significance too often. 
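
The diffuse and focused tests worked through above can be checked numerically. The following is a minimal Python sketch, not part of the original article. It assumes the effect sizes (.70, .68, .09) and squared standard errors (.0056, .0225, .0162) as read from Table 5, takes each weight as the reciprocal of the squared standard error, and uses hypothetical function names chosen for illustration only. Because the inputs are rounded table values, results may differ from the printed figures in the last digit.

```python
import math

# Effect sizes and squared standard errors as read from Table 5 (assumed values).
effects = [0.70, 0.68, 0.09]
se_squared = [0.0056, 0.0225, 0.0162]


def weighted_mean(effects, se_squared):
    """Mean effect size, weighting each study by w = 1 / SE^2."""
    weights = [1.0 / v for v in se_squared]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


def heterogeneity_chi2(effects, se_squared):
    """Diffuse test: chi-square on m - 1 df for heterogeneity of the effects."""
    mean = weighted_mean(effects, se_squared)
    return sum((e - mean) ** 2 / v for e, v in zip(effects, se_squared))


def contrast_z(effects, se_squared, lambdas):
    """Focused test: Z for a contrast whose weights (lambdas) sum to zero."""
    if abs(sum(lambdas)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    numerator = sum(l * e for l, e in zip(lambdas, effects))
    denominator = math.sqrt(sum(l * l * v for l, v in zip(lambdas, se_squared)))
    return numerator / denominator


def upper_tail_p(z):
    """One-tailed p value: standard normal area above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    print(round(weighted_mean(effects, se_squared), 3))
    print(round(heterogeneity_chi2(effects, se_squared), 2))
    print(round(contrast_z(effects, se_squared, [-1, 1, 0]), 2))
    z = contrast_z(effects, se_squared, [1, 1, -2])
    print(round(z, 2), upper_tail_p(z))
```

The check that the contrast weights sum to zero mirrors the requirement stated in the text. The p value for the heterogeneity chi-square is left out because the standard library has no general chi-square tail function.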

Coefficient of robustness of replication

In our earlier discussion of the heterogeneity of a set of $\Pi$s, our
evaluation of heterogeneity was in terms of significance testing. Although
such tests have some utility, they suffer from the problem that two
identical sets of effect sizes may differ dramatically in the significance
of their heterogeneity tests if their sample sizes (e.g., N of trials)
differ appreciably. It is, therefore, often informative to employ an index
of heterogeneity that is independent of sample size, for example, the root
mean square ($S_\Pi$) of the obtained $\Pi$s. The coefficient of robustness
of replication is defined simply as the mean $\Pi - .50$ divided by the
$S_\Pi$, or the reciprocal of the coefficient of variation (Rosenthal,
1990), or

$$C = \frac{\bar{\Pi} - .50}{S_\Pi}$$

This coefficient is particularly useful for comparing the findings from
two or more research areas for their robustness, adjusting for differences
in the number of studies or trials in each research area.
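
As a rough illustration of the coefficient just defined, the sketch below computes it in Python. It rests on several assumptions not confirmed by the text: the definition is read as (mean effect size minus .50) divided by the spread of the effect sizes, the mean is unweighted, and the spread is taken as the sample standard deviation (n - 1 denominator). The function name is hypothetical.

```python
import statistics


def robustness_coefficient(effects, null_value=0.50):
    """(mean effect - null value) / spread of the effects.

    Assumptions: unweighted mean, sample standard deviation (n - 1).
    """
    mean_effect = statistics.fmean(effects)
    spread = statistics.stdev(effects)  # needs at least two effect sizes
    return (mean_effect - null_value) / spread


if __name__ == "__main__":
    # The three effect sizes from Table 5: heterogeneous, mean close to .50.
    print(robustness_coefficient([0.70, 0.68, 0.09]))
    # A hypothetical homogeneous set above .50 gives a much larger value.
    print(robustness_coefficient([0.60, 0.62, 0.64]))
```

Under these assumptions the coefficient grows when the effects are more alike or when their mean lies further from .50. It is undefined when all effects are identical, since the spread is then zero.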

The utility of this coefficient is based on two ideas: first, that
replication success, clarity, or robustness depends on the homogeneity of
the obtained effect sizes, and second, that it depends also on the
unambiguity or clarity of the directionality of the results. Thus, a set
of replications grows in robustness as the variance of the
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